Index

Come back here any time by clicking on --Go to Index-- links throughout the page.

Exploratory data analysis of all variables

Goal

The goal of this notebook is to explore correlations among variables in a dataset, in order to select those features that merit further exploration.

This is a shotgun-style analysis. It serves as the basis for further exploration in quick visualization tools such as Tableau, and as a starting point for modeling decisions: feature selection, further feature engineering, model selection, and so on. See this example of the kind of exploration that stems from the results found here; there should be several like it, not all of which need to end up documented.

Sections

Example of how to use the data presented in these tables

Number of notifications vs outcome

Analyses by types

1. Categorical relationships effect size heatmap

2. Numerical relationships effect size heatmap

3. Numerical and categorical relationships effect size heatmap

Automatic analyses done by three functions

Summarize categorical

Summarize numerical

Summarize numerical categorical

The goal of these functions is to quickly assess whether there is a relationship between variables. So, for each variable:

  • A summary is presented: summary statistics, value counts, contingency table.
  • A plot is shown when the number of categories is small enough (20 or fewer).
  • A p-value for the null hypothesis that there is no relationship between the variables: chi^2 test, Pearson correlation test, Kruskal-Wallis H-test for independent samples, or independent two-sample t-test.
  • A measure of effect size: Cramer's V, Pearson's R, eta^2, or Cohen's d.
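As a rough illustration of how a test statistic becomes an effect size, the sketch below computes Cramer's V from a chi^2 test on a hypothetical toy table (the column values and the `has_return` name are invented for the example, not taken from the dataset). It uses the same formula as the notebook, V = sqrt(chi^2 / (n * min(r - 1, c - 1))), but turns off Yates' continuity correction, which `scipy` applies by default to 2x2 tables.

```python
import numpy as np
import pandas as pd
from scipy import stats

# Hypothetical toy data: two categorical columns with a perfect association
toy = pd.DataFrame({
    "trip_type": ["one_way"] * 50 + ["round_trip"] * 50,
    "has_return": ["no"] * 50 + ["yes"] * 50,
})

# Observed counts, without the margins row and column
observed = pd.crosstab(toy["trip_type"], toy["has_return"])

# correction=False disables Yates' continuity correction for this 2x2 table
chi2, p_value, dof, _ = stats.chi2_contingency(observed, correction=False)

# Cramer's V = sqrt(chi2 / (n * min(r - 1, c - 1)))
n = observed.to_numpy().sum()
t = min(observed.shape[0] - 1, observed.shape[1] - 1)
cramers_v = np.sqrt(chi2 / (n * t))

print("Cramer's V:", round(cramers_v, 2), "d.f.:", dof)
```

With a perfect association the statistic reaches its maximum, so V comes out as 1.0; independent columns would give a value near 0.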

Libraries and data

Libraries and Load data

Data

The data is a set of customer flight lookups made in the app. It was cleaned and processed in a separate notebook and combined with freely available datasets, which are used mostly to explore markets and market penetration.

Some feature engineering has been done, mostly separating information contained in a single variable. This process is documented in a separate notebook.

See a sample of the working dataset with all the current variables.





In [1]:
from IPython.core.display import HTML
style = """
<style>
div.output_area {
    overflow-y: scroll;
}
div.output_area img {
    max-width: unset;
}
</style>
"""
HTML(style)
Out[1]:
In [1]:
# Import resources
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from scipy import stats
from itertools import combinations
%matplotlib inline

import warnings
warnings.filterwarnings("ignore")

# Display all columns in dataframe
pd.set_option('display.max_columns', None)
In [2]:
# Function to summarize categorical variables

def summarize_categorical(df, x, y, plot=True):
    
    """
    Presents contingency table, chi-squared decision, and Cramer's V
    for two categorical variables.
    """
    
    plot_adj = False
    if plot & (df[x].unique().shape[0] <=20) & (df[y].unique().shape[0] <=20):
        plot_adj = True
    
    
    print("\n"+"\n"+"------------ "+ x + " and " + y + " ------------" +"\n")
    
    # Value counts
    print(x + " : " + str(df[x].unique().shape[0]) + " unique values.")
    print(y + " : " + str(df[y].unique().shape[0]) + " unique values.")
    
    # Plot
    if plot_adj:
        plt.figure()
        sns.countplot(y=x, hue=y, data=df,palette='husl',
                  order=df[x].value_counts().iloc[:10].index);
        plt.show()
    
    # Contingency table
    contingency_table = pd.crosstab(df[y],df[x],margins = True)
    
    if (contingency_table.shape[0] <= 10) & (contingency_table.shape[1] <= 10):
        print(contingency_table)
    else:
        print("\n" + "Too many values to print contingency table.")
            
    # Chi-squared
    contingency_cols = contingency_table.shape[1] - 1
    
    max_val = df[x].unique().shape[0]
 
    contingency_obs = np.array([contingency_table.iloc[0][0:max_val].values])
    for i in range(1,contingency_table.shape[0]-1):
        contingency_obs = np.vstack((contingency_obs, contingency_table.iloc[i][0:max_val].values))

     
    chi_2, p_v, deg_fr = stats.chi2_contingency(contingency_obs)[0:3]
    
    # Cramer's V
    ## Get the smallest of (r-1) or (c-1), with r->rows, c->columns, of the
    ## contingency table observations.
    t = np.min([contingency_obs.shape[0] - 1,contingency_obs.shape[1] - 1])
    ## Get the number of samples
    n = df[x].shape[0]
    ## Calculate Cramer's V
    cramers_v = np.sqrt(chi_2/(n * t))

    
    # Print results
    print("\n"+"\n"+"X^2 TEST")
    print("\n"+ "p-value to reject null: " + str(np.round(p_v,4)))
    print("X^2: " + str(np.round(chi_2,2)))
    print("D.f: " + str(deg_fr))
    print("\n"+"\n"+"Effect size")
    print("Cramer's V: " + str(np.round(cramers_v,2)))
    
    return p_v, cramers_v
In [3]:
# Function to summarize numerical variables
def summarize_numerical(df, x, y, summary=False, plot=True):
    """
    Presents a first glimpse into relationship between two numerical variables: plots
    vs each other, Pearson's R correlation, and p-value for it.
    """
    
    print("\n"+"\n"+"------------ "+ x + " and " + y + " ------------" +"\n")

    # Summary stats
    if summary:
        print("\n" + "Summary for " + x)
        summary = df[x].describe()
        print(summary)
        print("\n" + "Summary for " + y)
        summary = df[y].describe()
        print(summary)

    
    
    x_org = df[x][~df[x].isnull() & ~df[y].isnull()]
    y_org = df[y][~df[x].isnull() & ~df[y].isnull()]   
    x_log = np.log(x_org + 0.01)
    y_log = np.log(y_org + 0.01)
    
    # Plots
    if plot:
        fig, ax= plt.subplots(2,2, figsize=(14,6))
        ax[0,0].scatter(x = x_org, y = y_org, alpha = 0.1)
        ax[0,0].set(title=x + ' original and ' + y + ' original', xlabel=x, ylabel=y)
        ax[0,1].scatter(x=x_log, y=y_org, alpha = 0.1)
        ax[0,1].set(title=x + ' log_transf and ' + y + ' original', xlabel=x +' log_transf', ylabel=y)

        ax[1,0].scatter(x=x_org, y=y_log, alpha = 0.1)
        ax[1,0].set(title=x + ' original and ' + y + ' log_transf', xlabel=x, ylabel=y +' log_transf')
        ax[1,1].scatter(x=x_log, y=y_log, alpha = 0.1)
        ax[1,1].set(title=x + ' log_transf and ' + y + ' log_transf', xlabel=x +' log_transf', ylabel=y +' log_transf')

        plt.show()
    
    # Correlation
    r, p_val = stats.pearsonr(x_org, y_org)
    
    # Print results
    print("\n" + " Pearson's R: " + str(np.round(r,2)))
    print("\n" + " p-value to reject null: " + str(np.round(p_val,2)))
    
    return p_val, r
In [4]:
def t_test(tb,x,y):
    # Use the dataframe passed in (tb), not the global df
    vals = tb[y].unique()
    
    # Independent two-sample t-test
    c1 = tb[x][tb[y]==vals[0]].dropna()
    c2 = tb[x][tb[y]==vals[1]].dropna()

    t_stat, p_val = stats.ttest_ind(c1,c2)
    
    # Effect size Cohen's d
    numerator   = np.mean(c1) - np.mean(c2)
    denominator = np.sqrt((np.std(c1)**2 + np.std(c2)**2) / 2)
    cohens_d =  numerator/denominator
    
    # Give absolute value of Cohen's d
    cohens_d = np.abs(cohens_d)
    if cohens_d > 1:
        cohens_d = 1

    return p_val, cohens_d, t_stat

def kruskal_wallis(df,x,y):
    # Kruskal-Wallis
    ## Dictionary of numerical arrays for each categorical value
    cats = {}
    for i,cat in enumerate(df[y].unique()):
        cats[cat] = df[x][df[y] == cat].dropna()

    H, p_val = stats.kruskal(*cats.values())    
    
    # Effect size
    k = df[y].unique().shape[0]  # Number of groups
    n = df[x].dropna().shape[0] # Total number of observations
    eta2 = (H - k + 1)/(n - k)
    
    return p_val, eta2, H
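
A minimal sketch of the two effect sizes computed above, run on hypothetical samples rather than on the dataset. It mirrors the formulas in `t_test` and `kruskal_wallis`: Cohen's d divides the difference in means by the root mean square of the two standard deviations, and eta^2 is derived from the Kruskal-Wallis statistic as (H - k + 1) / (n - k). The group values are invented for illustration.

```python
import numpy as np
from scipy import stats

# Hypothetical samples, not taken from the dataset
group_a = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
group_b = np.array([6.0, 7.0, 8.0, 9.0, 10.0])
group_c = np.array([11.0, 12.0, 13.0, 14.0, 15.0])

# Cohen's d: difference in means over the root mean square of the two
# (population, ddof=0) standard deviations
rms_sd = np.sqrt((np.std(group_a) ** 2 + np.std(group_b) ** 2) / 2)
cohens_d = abs(np.mean(group_a) - np.mean(group_b)) / rms_sd

# Eta^2 from the Kruskal-Wallis H statistic
h_stat, p_value = stats.kruskal(group_a, group_b, group_c)
k = 3                                             # number of groups
n = len(group_a) + len(group_b) + len(group_c)    # total observations
eta2 = (h_stat - k + 1) / (n - k)

print("Cohen's d:", round(cohens_d, 2))
print("Eta^2:", round(eta2, 3))
```

Note that the uncapped Cohen's d here is well above 1, whereas `t_test` clips the value at 1, presumably so that it shares a scale with the other effect sizes.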
In [5]:
# Function to summarize numerical by categorical variables
def summarize_numerical_categorical(df, num, cat='outcome',plot=True, summary=True, hist=False):
    """
    num: name of the numerical variable (x)
    cat: name of the categorical variable (y)
    """
    
    x = num
    y = cat
    
    
    plot = False
    if df[y].unique().shape[0] <=20:
        plot = True
    
    print("\n"+"\n"+"------------ "+ x + " by " + y + " ------------" +"\n")
    
    
    # Summary stats
    if summary:
        summary = df[x].describe()
        print(summary)
    
    x_org = df[x][~df[x].isnull() & ~df[y].isnull()]  
    y_org = df[y][~df[x].isnull() & ~df[y].isnull()]
    x_log = np.log(x_org + 0.01)
    
    # Plots
    if hist:

        # Plot numerical variable distribution
        f, axes = plt.subplots(2, 2, sharex=False, gridspec_kw={"height_ratios":(.15, .85)}, 
                                            figsize = (12, 4))
        sns.boxplot(x_org, ax=axes[0,0])
        axes[1,0].hist(x_org)
        sns.boxplot(x_log, ax=axes[0,1])
        axes[1,1].hist(x_log)
        plt.show()    

    if plot:
        
        # Plot boxplots
        f, axes = plt.subplots(1, 2, figsize=(12,4))
        sns.boxplot(x = x_org, y = y_org , ax=axes[0]).set_title(x)
        sns.boxplot(x = x_log, y = y_org, ax=axes[1]).set_title(x + " log transf")
        plt.show()
        
    else:
        cat_vals = str(df[y].unique().shape[0])
        print("\n"+ cat_vals + " different categorical values, too many to plot." + "\n")

    # Tests: t-test if 2 groups; kruskal_wallis if 3 groups or more.
    vals = df[y].unique()
    if vals.shape[0] == 2:
        p_val,effect_size,statistic = t_test(df,x,y)
        
        test_name = "Independent two-sample t-test"
        effect_name = "Cohen's d"
        stat_name = "t-statistic"
        
    elif vals.shape[0] > 2:
        p_val,effect_size,statistic = kruskal_wallis(df,x,y)
        
        test_name = "Anova - Kruskal-Wallis"
        effect_name = "Eta^2"
        stat_name = "H"
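    else:
        # Fewer than two groups: no test can be run, so return early with
        # the sentinel p-value instead of failing on undefined names below
        print("Check the unique values of your categorical variable")
        return 1000, np.nan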
        
    
    # Print results
    print("\n"+"\n"+" " + test_name)
    print("p_val to reject null: " + str(np.round(p_val,4)))
    print(stat_name + " value: " + str(np.round(statistic,2)))
    print("\n"+"\n"+"Effect size")
    print(effect_name + ": " + str(np.round(effect_size,2)))
    
    return p_val,effect_size

Load dataset

In [7]:
df = pd.read_csv("df_ordered_filled_out_with_geography_population_and_passengers_carried.csv")
In [8]:
df.sample(20)
Out[8]:
transaction_id origin_city destination_city user_id trip_id trip_type departure_date return_date stay weekend filter_no_lcc filter_non_stop filter_short_layover filter_name status_updates first_search_dt watch_added_dt latest_status_change_dt status_latest total_notifs total_buy_notifs first_rec first_total last_rec last_total first_buy_dt first_buy_total lowest_total Use frequency outcome ordered session diff_day diff Search or watch first_buy - lowest_total days_to_departure latitude_deg_origin longitude_deg_origin continent_origin City origin latitude_deg_destination longitude_deg_destination continent_destination City destination region_origin region_destination country_origin country_destination Domestic or international Adult population Country name Capital origin name Count_uniq_users_per_country Percent unique users to country adult pop Country Code Region IncomeGroup Passengers carried Q1 Percent unique users to passengers carried by country
251039 501147 ORL CLE 38cd91530c4f7be82c440717b9cf3ea9f5f892ef3d1273... d4c27f01c5311fa4886dad35d1e4e99a3b9b2bd64d34cf... round_trip 2018-01-26 00:00:00 2018-01-28 00:00:00 2.0 1 0 0 0 NoFilter 1 2018-01-10 10:01:00 NaN 2018-01-10 10:01:00 shopped 0.0 0.0 buy 110.0 buy 110.0 2018-01-10 10:01:00 110.0 110.0 8 gained 4 2 0 0 days 12:00:00.000000000 search 0.0 15 28.545500 -81.332901 NorthA Orlando 41.411701 -81.849800 NorthA Cleveland FL OH US US Domestic 244635911.0 United States Washington 585228 0.239224 USA North America High income 2.222555e+08 0.263313
568128 831289 LHR DUB ad861b69926881aa7a806fc5fd57237f56ac9f0223f678... f2d2ac3035fa7e7480e81d2ca869020a9f029bf421bdf9... round_trip 2018-03-24 00:00:00 2018-03-25 00:00:00 1.0 1 0 0 0 NoFilter 1 2018-02-14 18:19:00 NaN 2018-02-14 18:19:00 shopped 0.0 0.0 buy 101.0 buy 94.0 2018-02-14 18:19:00 101.0 94.0 13 gained 10 4 0 0 days 00:01:00.000000000 search 7.0 37 51.470600 -0.461941 EU London 53.421299 -6.270070 EU Dublin ENG D GB IE International 51563063.0 United Kingdom London 10952 0.021240 GBR Europe & Central Asia High income 4.134715e+07 0.026488
105493 985087 QPH SZX 03f7e1c704f178d3150aa48f972569f0141aabda512e4c... 6ea40289cb286dbcfe2d20d84ef312dd51425df2984f9b... round_trip 2018-04-26 00:00:00 2018-05-04 00:00:00 8.0 0 0 0 0 NoFilter 1 2018-01-01 13:05:00 NaN 2018-01-01 13:05:00 shopped 0.0 0.0 buy 795.0 buy 795.0 2018-01-01 13:05:00 795.0 795.0 13 gained 1 1 0 0 days 00:00:00.000000000 search 0.0 114 -22.566999 27.150000 AF Palapye 22.639299 113.810997 AS Shenzhen CE 44 BW CN International 1270186.0 Botswana Gaborone 9821 0.773194 BWA Sub-Saharan Africa Upper middle income 6.335425e+04 15.501722
721156 568203 ORD PHX e58dbac77fa8861a117496e5505b114d65773ed693640a... 6c210d671510225fc3ac6818600f0046a2d541281c5ae4... round_trip 2018-02-16 00:00:00 2018-02-20 00:00:00 4.0 0 0 0 0 NoFilter 1 2018-01-09 16:26:00 NaN 2018-01-09 16:26:00 shopped 0.0 0.0 buy 337.0 buy 337.0 2018-01-09 16:26:00 337.0 337.0 19 gained 1 1 0 0 days 00:00:00.000000000 search 0.0 37 41.978600 -87.904800 NorthA Chicago 33.434299 -112.012001 NorthA Phoenix IL AZ US US Domestic 244635911.0 United States Washington 585228 0.239224 USA North America High income 2.222555e+08 0.263313
751511 852209 QPH SAN f0c1d358c0e29a673a279f8294c9dc5c85b896a5a7876f... cc44fc649ecd3af9c177a99e21f6a6e8053fd3cd0c1c1c... round_trip 2018-05-12 00:00:00 2018-05-19 00:00:00 7.0 0 0 0 0 NoFilter 10 2018-01-30 18:02:00 2018-01-30 18:02:00 2018-02-19 15:35:00 active 13.0 7.0 wait 297.0 buy 229.0 2018-03-07 16:48:00 226.0 131.0 2 expected 1 1 0 0 days 00:00:00.000000000 watch 95.0 101 -22.566999 27.150000 AF Palapye 32.733601 -117.190002 NorthA San Diego CE CA BW US International 1270186.0 Botswana Gaborone 9821 0.773194 BWA Sub-Saharan Africa Upper middle income 6.335425e+04 15.501722
199309 739873 OAK DCA 2591047d624ffb767a6e0143f597f33e9e65bc6a329c5b... f7478c81dc558ce2bbb6475672427b4db440deefa5f052... round_trip 2018-03-23 00:00:00 2018-03-25 00:00:00 2.0 1 0 0 0 NoFilter 1 2018-02-06 13:11:00 NaN 2018-02-06 13:11:00 shopped 0.0 0.0 buy 539.0 buy 539.0 2018-02-06 13:11:00 539.0 539.0 41 gained 13 5 7 7 days 13:58:00.000000000 search 0.0 44 37.721298 -122.221001 NorthA Oakland 38.852100 -77.037697 NorthA Washington CA DC US US Domestic 244635911.0 United States Washington 585228 0.239224 USA North America High income 2.222555e+08 0.263313
495114 65570 BGI FDF 92304efa41b145fb97c39586513a62c9cdb8def46d8445... 84a1f441dbce58eeb09edad33318ef04363690a7a83ea5... one_way 2018-03-15 00:00:00 NaN NaN 0 0 0 0 NoFilter 1 2018-02-23 23:30:00 NaN 2018-02-23 23:30:00 shopped 0.0 0.0 buy 181.0 buy 181.0 2018-02-23 23:30:00 181.0 181.0 6 gained 2 1 0 0 days 00:04:00.000000000 search 0.0 19 13.074600 -59.492500 NorthA Bridgetown 14.591000 -61.003201 NorthA Fort-de-France 01 U-A BB MQ International 217880.0 Barbados Bridgetown 172 0.078943 BRB Latin America & Caribbean High income NaN NaN
50355 595692 ICN JFK 8bfd9a871faf17f9995c52cfb06a812c827afd7d75e7b3... 156b1dc84cb35ae762cf3220c78243306aa2411503ec05... round_trip 2018-04-01 00:00:00 2018-04-08 00:00:00 7.0 0 0 0 0 NoFilter 1 2018-02-16 08:28:00 NaN 2018-02-16 08:28:00 shopped 0.0 0.0 wait 867.0 wait 867.0 NaN NaN 867.0 1 gained 1 1 0 0 days 00:00:00.000000000 search NaN 43 37.469101 126.450996 AS Seoul 40.639801 -73.778900 NorthA New York 28 NY KR US International 41781113.0 South Korea Seoul 1038 0.002484 KOR East Asia & Pacific High income 2.203939e+07 0.004710
375599 937754 SEA SNA 65f5223957c3530eb458c38ccab975772b5868a992b398... f0977f5a55a33fbf9f6e7c4aa2389846c4def81f010cce... round_trip 2018-04-04 00:00:00 2018-04-10 00:00:00 6.0 0 0 0 0 NoFilter 1 2018-02-23 02:16:00 NaN 2018-02-23 02:16:00 shopped 0.0 0.0 wait 193.0 wait 193.0 NaN NaN 193.0 7 gained 2 2 37 37 days 13:41:00.000000000 search NaN 39 47.449001 -122.308998 NorthA Seattle 33.675701 -117.867996 NorthA Santa Ana WA CA US US Domestic 244635911.0 United States Washington 585228 0.239224 USA North America High income 2.222555e+08 0.263313
245788 704702 COS DCA 36c79a69dc84e795f59d95a5fe544c06844c95dedb9ed6... 0a0d2ce155e90bb079ca2bb3afa55fac8d7fc7e78c3689... round_trip 2018-03-29 00:00:00 2018-04-02 00:00:00 4.0 1 0 0 0 NoFilter 1 2018-01-16 10:47:00 NaN 2018-01-16 10:47:00 shopped 0.0 0.0 wait 338.0 wait 338.0 NaN NaN 338.0 8 gained 6 3 1 1 days 00:05:00.000000000 search NaN 71 38.805801 -104.700996 NorthA Colorado Springs 38.852100 -77.037697 NorthA Washington CO DC US US Domestic 244635911.0 United States Washington 585228 0.239224 USA North America High income 2.222555e+08 0.263313
126578 724935 SLC CUN 0b467a9ea3e4728b3735abade411bdebd151806b1dde04... 0acee089116641ff3b2e8723381d908e06348cc8cf8f90... round_trip 2018-02-23 00:00:00 2018-02-28 00:00:00 5.0 0 0 0 0 NoFilter 1 2018-01-12 13:31:00 NaN 2018-01-12 13:31:00 shopped 0.0 0.0 buy 288.0 buy 288.0 2018-01-12 13:31:00 288.0 288.0 4 gained 2 2 6 6 days 18:58:00.000000000 search 0.0 41 40.788399 -111.977997 NorthA Salt Lake City 21.036501 -86.877098 NorthA Cancún UT ROO US MX International 244635911.0 United States Washington 585228 0.239224 USA North America High income 2.222555e+08 0.263313
61894 711066 SCL LIM 5ed119f12791922b3f5708a053823c217119277608ca46... 7f999aa50fed2147ad29321bd51c640e099beba93cebec... round_trip 2018-06-27 00:00:00 2018-07-16 00:00:00 19.0 0 0 1 0 NonStop 2 2018-01-29 10:33:00 2018-01-29 10:33:00 2018-01-29 10:33:00 active 10.0 2.0 wait 299.0 wait 346.0 2018-02-15 18:28:00 233.0 233.0 1 expected 1 1 0 0 days 00:00:00.000000000 watch 0.0 148 -33.393002 -70.785797 SA Santiago -12.021900 -77.114305 SA Lima RM LIM CL PE International 13749004.0 Chile Santiago 2199 0.015994 CHL Latin America & Caribbean High income 4.879296e+06 0.045068
422455 869561 SJU EWR 775dafd552372c2811fcdbe0c9740cf0bf8dda64a11a9c... 347c10a47b6eb40a41a011d76e45a1e57f347c8b31feb1... one_way 2018-02-26 00:00:00 NaN NaN 0 0 0 0 NoFilter 1 2018-01-10 09:27:00 NaN 2018-01-10 09:27:00 shopped 0.0 0.0 buy 124.0 buy 148.0 2018-01-10 09:27:00 124.0 124.0 24 gained 10 2 0 0 days 00:28:00.000000000 search 0.0 46 18.439400 -66.001801 NorthA San Juan 40.692501 -74.168701 NorthA New York U-A NJ PR US International 2334548.0 Puerto Rico San Juan 12666 0.542546 PRI Latin America & Caribbean High income NaN NaN
84594 919537 PTY BOG 31f20bb021373c92c90997d9c98a76c5f3399acdc87f86... 5dffb18eeb9158745bd011b50bf637e1991b881c020144... round_trip 2018-02-08 00:00:00 2018-02-18 00:00:00 10.0 0 0 0 0 NoFilter 1 2018-01-23 11:58:00 NaN 2018-01-23 11:58:00 shopped 0.0 0.0 buy 215.0 buy 215.0 2018-01-23 11:58:00 215.0 215.0 1 gained 1 1 0 0 days 00:00:00.000000000 search 0.0 15 9.071360 -79.383499 NorthA Tocumen 4.701590 -74.146900 SA Bogota 8 CUN PA CO International 2694239.0 Panama Panama City 2560 0.095018 PAN Latin America & Caribbean High income 3.234838e+06 0.079138
286026 696990 ATL MIA 45c561e2d1d0d7239d9db58313cc842999089601d86d35... 0d96cd139ff0f66faf8b26931daa96e54d3d4a9cf77357... round_trip 2018-03-07 00:00:00 2018-03-13 00:00:00 6.0 0 0 0 0 NoFilter 1 2018-01-02 04:45:00 NaN 2018-01-02 04:45:00 shopped 0.0 0.0 buy 115.0 buy 115.0 2018-01-02 04:45:00 115.0 115.0 4 gained 1 1 0 0 days 00:00:00.000000000 search 0.0 63 33.636700 -84.428101 NorthA Atlanta 25.793200 -80.290604 NorthA Miami GA FL US US Domestic 244635911.0 United States Washington 585228 0.239224 USA North America High income 2.222555e+08 0.263313
56160 654275 SLC SEA d9fc70558ab7e99873b614a8b37005340d123d608f95d2... 1fa888bfefc52cd18ee170366b6c3b69aafa4844f6d52b... round_trip 2018-02-17 00:00:00 2018-02-24 00:00:00 7.0 0 0 1 0 NonStop 3 2018-01-17 18:56:00 2018-01-17 18:57:00 2018-02-11 09:25:00 inactive 5.0 5.0 buy 262.0 buy 647.0 2018-01-17 18:56:00 262.0 202.0 1 lost 1 1 0 0 days 00:00:00.000000000 watch 60.0 30 40.788399 -111.977997 NorthA Salt Lake City 47.449001 -122.308998 NorthA Seattle UT WA US US Domestic 244635911.0 United States Washington 585228 0.239224 USA North America High income 2.222555e+08 0.263313
636635 360338 YQU HNL c68053db0cefc1bfcd5bb17d4ee0927aace29436c2a486... 0dd7f18c57251dfbb665f62f7bfafc679995e5f58b1d59... round_trip 2018-02-19 00:00:00 2018-03-01 00:00:00 10.0 0 0 0 0 NoFilter 1 2018-01-09 10:19:00 NaN 2018-01-09 10:19:00 shopped 0.0 0.0 wait 1029.0 wait 1029.0 NaN NaN 1029.0 3 gained 1 1 0 0 days 00:00:00.000000000 search NaN 40 55.179699 -118.885002 NorthA Grande Prairie 21.320620 -157.924228 NorthA Honolulu AB HI CA US International 29156938.0 Canada Ottawa 50409 0.172889 CAN North America High income 2.234500e+07 0.225594
171754 785709 MCO DCA 1b86c9e0e6864815ca7ddf25b5876ddee1bda26bbab86e... 76e4b1ac9947989862b523b1f27b7d486914f4315ea76b... round_trip 2018-03-30 00:00:00 2018-04-01 00:00:00 2.0 1 0 0 0 NoFilter 3 2018-03-20 18:43:00 2018-03-20 18:44:00 2018-03-28 23:10:00 inactive 11.0 11.0 buy 232.0 buy 237.0 2018-03-20 18:43:00 232.0 232.0 9 lost 4 2 54 54 days 02:18:00.000000000 watch 0.0 9 28.429399 -81.308998 NorthA Orlando 38.852100 -77.037697 NorthA Washington FL DC US US Domestic 244635911.0 United States Washington 585228 0.239224 USA North America High income 2.222555e+08 0.263313
143497 724118 HNL LAX 116cc43b7980f50a9a8c33f75edfb0781d3dbbf7163f26... a426dd93d10661651e3ad11edfec987a10fc63e4045b6f... round_trip 2018-04-10 00:00:00 2018-04-18 00:00:00 8.0 0 0 0 0 NoFilter 3 2018-01-08 16:22:00 2018-01-08 16:22:00 2018-03-07 11:20:00 inactive 13.0 8.0 wait 597.0 buy 502.0 2018-01-10 01:31:00 504.0 456.0 4 lost 2 1 0 0 days 00:03:00.000000000 watch 48.0 91 21.320620 -157.924228 NorthA Honolulu 33.942501 -118.407997 NorthA Los Angeles HI CA US US Domestic 244635911.0 United States Washington 585228 0.239224 USA North America High income 2.222555e+08 0.263313
14684 193882 BRU ALC 719f38d90fe60162a921b6df8f026345d7171c10b216db... c68645efb3bd5147fcae9d94bc93320a250665f07cd345... round_trip 2018-04-24 00:00:00 2018-04-27 00:00:00 3.0 0 0 0 0 NoFilter 3 2018-03-25 07:15:00 2018-03-25 07:15:00 2018-03-25 07:15:00 inactive 0.0 0.0 buy 174.0 buy 174.0 2018-03-25 07:15:00 174.0 174.0 1 lost 1 1 0 0 days 00:00:00.000000000 watch 0.0 29 50.901402 4.484440 EU Brussels 38.282200 -0.558156 EU Alicante BRU V BE ES International 8891916.0 Belgium Brussels 1620 0.018219 BEL Europe & Central Asia High income 3.409872e+06 0.047509
In [9]:
df['is_session_1'] = df['session'] == 1
print("There are " + str(np.sum(df['is_session_1'])) + " entries of session 1")
df['is_US'] = df['Country name'] == 'United States'
print("There are " + str(np.sum(df['is_US'])) + " entries of US")
df['is_CA'] = df['Country name'] == 'Canada'
print("There are " + str(np.sum(df['is_CA'])) + " entries of CA")
There are 370413 entries of session 1
There are 585228 entries of US
There are 50409 entries of CA

Categorical variables

In [10]:
categorical = ['origin_city','destination_city','trip_type','weekend','filter_no_lcc','filter_non_stop',
                'filter_short_layover', 'filter_name','first_rec','last_rec','is_session_1',
                'Search or watch','Use frequency','continent_origin','City origin', 'City destination',
                'region_origin','region_destination','country_origin','continent_destination',
                'country_destination','Domestic or international','Region','IncomeGroup','outcome']
In [11]:
df[categorical].dtypes
Out[11]:
origin_city                  object
destination_city             object
trip_type                    object
weekend                       int64
filter_no_lcc                 int64
filter_non_stop               int64
filter_short_layover          int64
filter_name                  object
first_rec                    object
last_rec                     object
is_session_1                   bool
Search or watch              object
Use frequency                 int64
continent_origin             object
City origin                  object
City destination             object
region_origin                object
region_destination           object
country_origin               object
continent_destination        object
country_destination          object
Domestic or international    object
Region                       object
IncomeGroup                  object
outcome                      object
dtype: object

Make NaN values in the categorical data a category of their own.
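
A minimal sketch of the idea on a hypothetical toy column. It labels the missing values explicitly with `fillna` before converting to a categorical dtype. This is a variant of the string cast used in this notebook, chosen here on the assumption that how missing values are cast to strings may differ between pandas versions.

```python
import numpy as np
import pandas as pd

# Hypothetical column with one missing value
toy = pd.DataFrame({"filter_name": ["NoFilter", np.nan, "NonStop", "NoFilter"]})

# Give missing values an explicit label, then convert to a categorical dtype
as_category = toy["filter_name"].fillna("nan").astype("category")

# The missing value is now counted like any other level
counts = as_category.value_counts()
print(counts)
```

With the missing values labeled, a contingency table built with `pd.crosstab` keeps those rows instead of silently dropping them.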

In [12]:
df[categorical].dtypes.value_counts()
Out[12]:
object    19
int64      5
bool       1
dtype: int64
In [13]:
for c in categorical:
    df[c] = df[c].astype(str).astype('category')
   
df[categorical].dtypes
Out[13]:
origin_city                  category
destination_city             category
trip_type                    category
weekend                      category
filter_no_lcc                category
filter_non_stop              category
filter_short_layover         category
filter_name                  category
first_rec                    category
last_rec                     category
is_session_1                 category
Search or watch              category
Use frequency                category
continent_origin             category
City origin                  category
City destination             category
region_origin                category
region_destination           category
country_origin               category
continent_destination        category
country_destination          category
Domestic or international    category
Region                       category
IncomeGroup                  category
outcome                      category
dtype: object

Summarizing categorical variables against each other

In [14]:
categorical_correlations = pd.DataFrame(index=categorical, columns=categorical)
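
The empty frame above is meant to hold one effect size per pair of variables. As a sketch of how such a matrix is filled, the example below uses hypothetical variable names and made-up (p-value, effect size) pairs in place of real calls to `summarize_categorical`, and keeps an effect only when its p-value is below 0.05.

```python
from itertools import combinations

import numpy as np
import pandas as pd

# Hypothetical variable names and (p-value, effect size) results
names = ["a", "b", "c"]
fake_results = {
    ("a", "b"): (0.01, 0.4),
    ("a", "c"): (0.20, 0.1),
    ("b", "c"): (0.03, 0.7),
}

# Start from an all-NaN float matrix so it can be plotted directly later
effects = pd.DataFrame(np.nan, index=names, columns=names)

for i, j in combinations(names, 2):
    p, effect = fake_results[(i, j)]
    if p < 0.05:
        # Row j, column i: only the lower triangle is filled
        effects.loc[j, i] = effect

print(effects)
```

A float matrix like this can be passed to `sns.heatmap`, where the NaN cells (non-significant pairs and the upper triangle) are left blank.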
In [14]:
for i,j in combinations(categorical,2):
    p, effect = summarize_categorical(df,i,j)
    if p < .05:
        # .loc avoids chained assignment; row j, column i as before
        categorical_correlations.loc[j, i] = effect

------------ origin_city and destination_city ------------

origin_city : 1324 unique values.
destination_city : 1583 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 17195192.92
D.f: 2092986


Effect size
Cramer's V: 0.11


------------ origin_city and trip_type ------------

origin_city : 1324 unique values.
trip_type : 2 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 63454.83
D.f: 1323


Effect size
Cramer's V: 0.25


------------ origin_city and weekend ------------

origin_city : 1324 unique values.
weekend : 2 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 35868.39
D.f: 1323


Effect size
Cramer's V: 0.19


------------ origin_city and filter_no_lcc ------------

origin_city : 1324 unique values.
filter_no_lcc : 2 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 3068.51
D.f: 1323


Effect size
Cramer's V: 0.06


------------ origin_city and filter_non_stop ------------

origin_city : 1324 unique values.
filter_non_stop : 2 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 19925.34
D.f: 1323


Effect size
Cramer's V: 0.14


------------ origin_city and filter_short_layover ------------

origin_city : 1324 unique values.
filter_short_layover : 2 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 7965.37
D.f: 1323


Effect size
Cramer's V: 0.09


------------ origin_city and filter_name ------------

origin_city : 1324 unique values.
filter_name : 6 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 32988.16
D.f: 6615


Effect size
Cramer's V: 0.08


------------ origin_city and first_rec ------------

origin_city : 1324 unique values.
first_rec : 3 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 24478.84
D.f: 2646


Effect size
Cramer's V: 0.11


------------ origin_city and last_rec ------------

origin_city : 1324 unique values.
last_rec : 3 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 19234.13
D.f: 2646


Effect size
Cramer's V: 0.1


------------ origin_city and is_session_1 ------------

origin_city : 1324 unique values.
is_session_1 : 2 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 12806.05
D.f: 1323


Effect size
Cramer's V: 0.11


------------ origin_city and Search or watch ------------

origin_city : 1324 unique values.
Search or watch : 2 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 4972.08
D.f: 1323


Effect size
Cramer's V: 0.07


------------ origin_city and Use frequency ------------

origin_city : 1324 unique values.
Use frequency : 2 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 8270.52
D.f: 1323


Effect size
Cramer's V: 0.09


------------ origin_city and continent_origin ------------

origin_city : 1324 unique values.
continent_origin : 6 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 5038460.0
D.f: 6615


Effect size
Cramer's V: 1.0


------------ origin_city and City origin ------------

origin_city : 1324 unique values.
City origin : 1237 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 1245507312.0
D.f: 1635228


Effect size
Cramer's V: 1.0


------------ origin_city and City destination ------------

origin_city : 1324 unique values.
City destination : 1487 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 16623566.45
D.f: 1965978


Effect size
Cramer's V: 0.11


------------ origin_city and region_origin ------------

origin_city : 1324 unique values.
region_origin : 453 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 455476784.0
D.f: 597996


Effect size
Cramer's V: 1.0


------------ origin_city and region_destination ------------

origin_city : 1324 unique values.
region_destination : 498 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 6146998.61
D.f: 657531


Effect size
Cramer's V: 0.11


------------ origin_city and country_origin ------------

origin_city : 1324 unique values.
country_origin : 188 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 188438404.0
D.f: 247401


Effect size
Cramer's V: 1.0


------------ origin_city and continent_destination ------------

origin_city : 1324 unique values.
continent_destination : 6 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 471425.99
D.f: 6615


Effect size
Cramer's V: 0.31


------------ origin_city and country_destination ------------

origin_city : 1324 unique values.
country_destination : 209 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 2599910.26
D.f: 275184


Effect size
Cramer's V: 0.11


------------ origin_city and Domestic or international ------------

origin_city : 1324 unique values.
Domestic or international : 2 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 309490.45
D.f: 1323


Effect size
Cramer's V: 0.55


------------ origin_city and Region ------------

origin_city : 1324 unique values.
Region : 8 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 7053844.0
D.f: 9261


Effect size
Cramer's V: 1.0


------------ origin_city and IncomeGroup ------------

origin_city : 1324 unique values.
IncomeGroup : 5 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 4030768.0
D.f: 5292


Effect size
Cramer's V: 1.0


------------ origin_city and outcome ------------

origin_city : 1324 unique values.
outcome : 3 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 8044.5
D.f: 2646


Effect size
Cramer's V: 0.06


------------ destination_city and trip_type ------------

destination_city : 1583 unique values.
trip_type : 2 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 36138.66
D.f: 1582


Effect size
Cramer's V: 0.19


------------ destination_city and weekend ------------

destination_city : 1583 unique values.
weekend : 2 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 98031.79
D.f: 1582


Effect size
Cramer's V: 0.31


------------ destination_city and filter_no_lcc ------------

destination_city : 1583 unique values.
filter_no_lcc : 2 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 3899.21
D.f: 1582


Effect size
Cramer's V: 0.06


------------ destination_city and filter_non_stop ------------

destination_city : 1583 unique values.
filter_non_stop : 2 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 19731.37
D.f: 1582


Effect size
Cramer's V: 0.14


------------ destination_city and filter_short_layover ------------

destination_city : 1583 unique values.
filter_short_layover : 2 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 12278.92
D.f: 1582


Effect size
Cramer's V: 0.11


------------ destination_city and filter_name ------------

destination_city : 1583 unique values.
filter_name : 6 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 37517.73
D.f: 7910


Effect size
Cramer's V: 0.09


------------ destination_city and first_rec ------------

destination_city : 1583 unique values.
first_rec : 3 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 36545.71
D.f: 3164


Effect size
Cramer's V: 0.13


------------ destination_city and last_rec ------------

destination_city : 1583 unique values.
last_rec : 3 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 28809.55
D.f: 3164


Effect size
Cramer's V: 0.12


------------ destination_city and is_session_1 ------------

destination_city : 1583 unique values.
is_session_1 : 2 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 7310.02
D.f: 1582


Effect size
Cramer's V: 0.09


------------ destination_city and Search or watch ------------

destination_city : 1583 unique values.
Search or watch : 2 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 7644.88
D.f: 1582


Effect size
Cramer's V: 0.09


------------ destination_city and Use frequency ------------

destination_city : 1583 unique values.
Use frequency : 2 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 5336.47
D.f: 1582


Effect size
Cramer's V: 0.07


------------ destination_city and continent_origin ------------

destination_city : 1583 unique values.
continent_origin : 6 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 601510.69
D.f: 7910


Effect size
Cramer's V: 0.35


------------ destination_city and City origin ------------

destination_city : 1583 unique values.
City origin : 1237 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 15049133.53
D.f: 1955352


Effect size
Cramer's V: 0.11


------------ destination_city and City destination ------------

destination_city : 1583 unique values.
City destination : 1487 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 1497430312.0
D.f: 2350852


Effect size
Cramer's V: 1.0


------------ destination_city and region_origin ------------

destination_city : 1583 unique values.
region_origin : 453 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 6941738.56
D.f: 715064


Effect size
Cramer's V: 0.12


------------ destination_city and region_destination ------------

destination_city : 1583 unique values.
region_destination : 498 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 500822924.0
D.f: 786254


Effect size
Cramer's V: 1.0


------------ destination_city and country_origin ------------

destination_city : 1583 unique values.
country_origin : 188 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 3322527.7
D.f: 295834


Effect size
Cramer's V: 0.13


------------ destination_city and continent_destination ------------

destination_city : 1583 unique values.
continent_destination : 6 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 5038460.0
D.f: 7910


Effect size
Cramer's V: 1.0


------------ destination_city and country_destination ------------

destination_city : 1583 unique values.
country_destination : 209 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 209599936.0
D.f: 329056


Effect size
Cramer's V: 1.0


------------ destination_city and Domestic or international ------------

destination_city : 1583 unique values.
Domestic or international : 2 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 639073.56
D.f: 1582


Effect size
Cramer's V: 0.8


------------ destination_city and Region ------------

destination_city : 1583 unique values.
Region : 8 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 621780.44
D.f: 11074


Effect size
Cramer's V: 0.3


------------ destination_city and IncomeGroup ------------

destination_city : 1583 unique values.
IncomeGroup : 5 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 175014.28
D.f: 6328


Effect size
Cramer's V: 0.21


------------ destination_city and outcome ------------

destination_city : 1583 unique values.
outcome : 3 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 13412.18
D.f: 3164


Effect size
Cramer's V: 0.08


------------ trip_type and weekend ------------

trip_type : 2 unique values.
weekend : 2 unique values.
trip_type  one_way  round_trip      All
weekend                                
0           168772      639008   807780
1                0      199912   199912
All         168772      838920  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 50169.58
D.f: 1


Effect size
Cramer's V: 0.22
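
The statistic above can be reproduced from the printed table using only the standard library. The printed X^2 appears consistent with Yates' continuity correction, which some libraries (for example scipy.stats.chi2_contingency) apply by default to 2x2 tables. That is an inference from the numbers, not something the output states, and the helper name below is hypothetical.

```python
import math

def chi2_2x2_yates(table):
    """Chi-squared statistic for a 2x2 table with Yates' continuity correction."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            # Shrink each |observed - expected| by 0.5, never below zero
            diff = max(abs(observed - expected) - 0.5, 0.0)
            chi2 += diff ** 2 / expected
    return chi2, n

# Rows: weekend = 0, 1; columns: one_way, round_trip (counts from the table above)
table = [[168772, 639008],
         [0, 199912]]

chi2, n = chi2_2x2_yates(table)
v = math.sqrt(chi2 / n)  # min(r - 1, c - 1) = 1 for a 2x2 table

print(round(chi2, 2))  # close to the printed 50169.58
print(round(v, 2))     # 0.22
```

The zero cell (no one-way searches flagged as weekend trips) drives most of the statistic, so the association may reflect how the weekend flag was engineered rather than user behaviour.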


------------ trip_type and filter_no_lcc ------------

trip_type : 2 unique values.
filter_no_lcc : 2 unique values.
trip_type      one_way  round_trip      All
filter_no_lcc                              
0               166696      827712   994408
1                 2076       11208    13284
All             168772      838920  1007692


X^2 TEST

p-value to reject null: 0.0005
X^2: 12.04
D.f: 1


Effect size
Cramer's V: 0.0
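
This pair shows why the p-value and the effect size are read together. With about one million rows, even a negligible association rejects the null. The sketch below recomputes both numbers from the printed X^2, using the identity that for one degree of freedom the chi-squared survival function is erfc(sqrt(x / 2)). The plain Cramér's V formula is an assumption about the notebook.

```python
import math

chi2 = 12.04   # printed X^2 for trip_type vs filter_no_lcc
n = 1007692    # "All" total from the contingency table above

# p-value for a chi-squared statistic with 1 degree of freedom
p_value = math.erfc(math.sqrt(chi2 / 2))

# Plain Cramer's V; min(r - 1, c - 1) = 1 for a 2x2 table
v = math.sqrt(chi2 / n)

print(round(p_value, 4))  # 0.0005
print(round(v, 4))        # 0.0035, which the output rounds to 0.0
```

A feature with this profile is statistically detectable but unlikely to be worth carrying into modeling on the strength of this relationship alone.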


------------ trip_type and filter_non_stop ------------

trip_type : 2 unique values.
filter_non_stop : 2 unique values.
trip_type        one_way  round_trip      All
filter_non_stop                              
0                 151320      752554   903874
1                  17452       86366   103818
All               168772      838920  1007692


X^2 TEST

p-value to reject null: 0.5763
X^2: 0.31
D.f: 1


Effect size
Cramer's V: 0.0


------------ trip_type and filter_short_layover ------------

trip_type : 2 unique values.
filter_short_layover : 2 unique values.
trip_type             one_way  round_trip      All
filter_short_layover                              
0                      163829      813095   976924
1                        4943       25825    30768
All                    168772      838920  1007692


X^2 TEST

p-value to reject null: 0.0012
X^2: 10.57
D.f: 1


Effect size
Cramer's V: 0.0


------------ trip_type and filter_name ------------

trip_type : 2 unique values.
filter_name : 6 unique values.
trip_type                one_way  round_trip      All
filter_name                                          
And(NonStop,NoLCC)          1030        5218     6248
And(ShortLayover,NoLCC)      383        1838     2221
NoFilter                  145714      722577   868291
NoLCC                        663        4152     4815
NonStop                    16422       81148    97570
ShortLayover                4560       23987    28547
All                       168772      838920  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 44.8
D.f: 5


Effect size
Cramer's V: 0.01


------------ trip_type and first_rec ------------

trip_type : 2 unique values.
first_rec : 3 unique values.
trip_type  one_way  round_trip      All
first_rec                              
buy          87405      420652   508057
nan          10126       48037    58163
wait         71241      370231   441472
All         168772      838920  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 212.1
D.f: 2


Effect size
Cramer's V: 0.01


------------ trip_type and last_rec ------------

trip_type : 2 unique values.
last_rec : 3 unique values.
trip_type  one_way  round_trip      All
last_rec                               
buy          93766      457818   551584
nan          10126       48037    58163
wait         64880      333065   397945
All         168772      838920  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 99.59
D.f: 2


Effect size
Cramer's V: 0.01


------------ trip_type and is_session_1 ------------

trip_type : 2 unique values.
is_session_1 : 2 unique values.
trip_type     one_way  round_trip      All
is_session_1                              
False          100681      436527   537208
True            68091      402393   470484
All            168772      838920  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 3277.96
D.f: 1


Effect size
Cramer's V: 0.06


------------ trip_type and Search or watch ------------

trip_type : 2 unique values.
Search or watch : 2 unique values.
trip_type        one_way  round_trip      All
Search or watch                              
search            112488      543962   656450
watch              56284      294958   351242
All               168772      838920  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 202.67
D.f: 1


Effect size
Cramer's V: 0.01


------------ trip_type and Use frequency ------------

trip_type : 2 unique values.
Use frequency : 2 unique values.
trip_type       one_way  round_trip      All
Use frequency                               
more than once   160670      774176   934846
once               8102       64744    72846
All              168772      838920  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 1782.23
D.f: 1


Effect size
Cramer's V: 0.04


------------ trip_type and continent_origin ------------

trip_type : 2 unique values.
continent_origin : 6 unique values.
trip_type         one_way  round_trip      All
continent_origin                              
AF                   2746       13781    16527
AS                   9176       20647    29823
EU                  18452       41096    59548
NorthA             131286      728017   859303
OC                   2064        5311     7375
SA                   5048       30068    35116
All                168772      838920  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 15004.31
D.f: 5


Effect size
Cramer's V: 0.12


------------ trip_type and City origin ------------

trip_type : 2 unique values.
City origin : 1237 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 62089.14
D.f: 1236


Effect size
Cramer's V: 0.25


------------ trip_type and City destination ------------

trip_type : 2 unique values.
City destination : 1487 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 34557.81
D.f: 1486


Effect size
Cramer's V: 0.19


------------ trip_type and region_origin ------------

trip_type : 2 unique values.
region_origin : 453 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 50249.05
D.f: 452


Effect size
Cramer's V: 0.22


------------ trip_type and region_destination ------------

trip_type : 2 unique values.
region_destination : 498 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 23830.36
D.f: 497


Effect size
Cramer's V: 0.15


------------ trip_type and country_origin ------------

trip_type : 2 unique values.
country_origin : 188 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 33086.84
D.f: 187


Effect size
Cramer's V: 0.18


------------ trip_type and continent_destination ------------

trip_type : 2 unique values.
continent_destination : 6 unique values.
trip_type              one_way  round_trip      All
continent_destination                              
AF                        2589       12203    14792
AS                       10942       70998    81940
EU                       23749      121176   144925
NorthA                  124350      595385   719735
OC                        2651        9522    12173
SA                        4491       29636    34127
All                     168772      838920  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 1377.22
D.f: 5


Effect size
Cramer's V: 0.04


------------ trip_type and country_destination ------------

trip_type : 2 unique values.
country_destination : 209 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 13649.61
D.f: 208


Effect size
Cramer's V: 0.12


------------ trip_type and Domestic or international ------------

trip_type : 2 unique values.
Domestic or international : 2 unique values.
trip_type                  one_way  round_trip      All
Domestic or international                              
Domestic                     88508      428244   516752
International                80264      410676   490940
All                         168772      838920  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 109.43
D.f: 1


Effect size
Cramer's V: 0.01


------------ trip_type and Region ------------

trip_type : 2 unique values.
Region : 8 unique values.
trip_type                   one_way  round_trip      All
Region                                                  
East Asia & Pacific            7840       15965    23805
Europe & Central Asia         18686       41584    60270
Latin America & Caribbean     16961       70081    87042
Middle East & North Africa     1851        6807     8658
North America                119331      687931   807262
South Asia                     1511        2691     4202
Sub-Saharan Africa             2375       13289    15664
nan                             217         572      789
All                          168772      838920  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 17301.96
D.f: 7


Effect size
Cramer's V: 0.13


------------ trip_type and IncomeGroup ------------

trip_type : 2 unique values.
IncomeGroup : 5 unique values.
trip_type            one_way  round_trip      All
IncomeGroup                                      
High income           147720      762929   910649
Low income               356         336      692
Lower middle income     4972       10995    15967
Upper middle income    15507       64088    79595
nan                      217         572      789
All                   168772      838920  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 3642.53
D.f: 4


Effect size
Cramer's V: 0.06


------------ trip_type and outcome ------------

trip_type : 2 unique values.
outcome : 3 unique values.
trip_type  one_way  round_trip      All
outcome                                
expected     18788      117181   135969
gained      113299      545831   659130
lost         36685      175908   212593
All         168772      838920  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 968.6
D.f: 2


Effect size
Cramer's V: 0.03


------------ weekend and filter_no_lcc ------------

weekend : 2 unique values.
filter_no_lcc : 2 unique values.
weekend             0       1      All
filter_no_lcc                         
0              797592  196816   994408
1               10188    3096    13284
All            807780  199912  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 101.56
D.f: 1


Effect size
Cramer's V: 0.01


------------ weekend and filter_non_stop ------------

weekend : 2 unique values.
filter_non_stop : 2 unique values.
weekend               0       1      All
filter_non_stop                         
0                730919  172955   903874
1                 76861   26957   103818
All              807780  199912  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 2731.79
D.f: 1


Effect size
Cramer's V: 0.05


------------ weekend and filter_short_layover ------------

weekend : 2 unique values.
filter_short_layover : 2 unique values.
weekend                    0       1      All
filter_short_layover                         
0                     781189  195735   976924
1                      26591    4177    30768
All                   807780  199912  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 782.35
D.f: 1


Effect size
Cramer's V: 0.03


------------ weekend and filter_name ------------

weekend : 2 unique values.
filter_name : 6 unique values.
weekend                       0       1      All
filter_name                                     
And(NonStop,NoLCC)         4459    1789     6248
And(ShortLayover,NoLCC)    1929     292     2221
NoFilter                 700528  167763   868291
NoLCC                      3800    1015     4815
NonStop                   72402   25168    97570
ShortLayover              24662    3885    28547
All                      807780  199912  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 3390.53
D.f: 5


Effect size
Cramer's V: 0.06


------------ weekend and first_rec ------------

weekend : 2 unique values.
first_rec : 3 unique values.
weekend         0       1      All
first_rec                         
buy        413500   94557   508057
nan         47178   10985    58163
wait       347102   94370   441472
All        807780  199912  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 1170.54
D.f: 2


Effect size
Cramer's V: 0.03


------------ weekend and last_rec ------------

weekend : 2 unique values.
last_rec : 3 unique values.
weekend        0       1      All
last_rec                         
buy       443205  108379   551584
nan        47178   10985    58163
wait      317397   80548   397945
All       807780  199912  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 86.18
D.f: 2


Effect size
Cramer's V: 0.01


------------ weekend and is_session_1 ------------

weekend : 2 unique values.
is_session_1 : 2 unique values.
weekend            0       1      All
is_session_1                         
False         429789  107419   537208
True          377991   92493   470484
All           807780  199912  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 17.86
D.f: 1


Effect size
Cramer's V: 0.0


------------ weekend and Search or watch ------------

weekend : 2 unique values.
Search or watch : 2 unique values.
weekend               0       1      All
Search or watch                         
search           535289  121161   656450
watch            272491   78751   351242
All              807780  199912  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 2260.28
D.f: 1


Effect size
Cramer's V: 0.05


------------ weekend and Use frequency ------------

weekend : 2 unique values.
Use frequency : 2 unique values.
weekend              0       1      All
Use frequency                          
more than once  750034  184812   934846
once             57746   15100    72846
All             807780  199912  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 39.06
D.f: 1


Effect size
Cramer's V: 0.01


------------ weekend and continent_origin ------------

weekend : 2 unique values.
continent_origin : 6 unique values.
weekend                0       1      All
continent_origin                         
AF                 12810    3717    16527
AS                 28074    1749    29823
EU                 53276    6272    59548
NorthA            673872  185431   859303
OC                  6800     575     7375
SA                 32948    2168    35116
All               807780  199912  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 13410.6
D.f: 5


Effect size
Cramer's V: 0.12


------------ weekend and City origin ------------

weekend : 2 unique values.
City origin : 1237 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 33993.9
D.f: 1236


Effect size
Cramer's V: 0.18


------------ weekend and City destination ------------

weekend : 2 unique values.
City destination : 1487 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 96642.12
D.f: 1486


Effect size
Cramer's V: 0.31


------------ weekend and region_origin ------------

weekend : 2 unique values.
region_origin : 453 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 29300.47
D.f: 452


Effect size
Cramer's V: 0.17


------------ weekend and region_destination ------------

weekend : 2 unique values.
region_destination : 498 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 87320.91
D.f: 497


Effect size
Cramer's V: 0.29


------------ weekend and country_origin ------------

weekend : 2 unique values.
country_origin : 188 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 25459.28
D.f: 187


Effect size
Cramer's V: 0.16


------------ weekend and continent_destination ------------

weekend : 2 unique values.
continent_destination : 6 unique values.
weekend                     0       1      All
continent_destination                         
AF                      13234    1558    14792
AS                      79912    2028    81940
EU                     136057    8868   144925
NorthA                 535831  183904   719735
OC                      11580     593    12173
SA                      31166    2961    34127
All                    807780  199912  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 52653.56
D.f: 5


Effect size
Cramer's V: 0.23


------------ weekend and country_destination ------------

weekend : 2 unique values.
country_destination : 209 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 66628.21
D.f: 208


Effect size
Cramer's V: 0.26


------------ weekend and Domestic or international ------------

weekend : 2 unique values.
Domestic or international : 2 unique values.
weekend                         0       1      All
Domestic or international                         
Domestic                   359725  157027   516752
International              448055   42885   490940
All                        807780  199912  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 74215.62
D.f: 1


Effect size
Cramer's V: 0.27


------------ weekend and Region ------------

weekend : 2 unique values.
Region : 8 unique values.
weekend                          0       1      All
Region                                             
East Asia & Pacific          22227    1578    23805
Europe & Central Asia        53944    6326    60270
Latin America & Caribbean    79350    7692    87042
Middle East & North Africa    8100     558     8658
North America               627365  179897   807262
South Asia                    4108      94     4202
Sub-Saharan Africa           11968    3696    15664
nan                            718      71      789
All                         807780  199912  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 17574.69
D.f: 7


Effect size
Cramer's V: 0.13


------------ weekend and IncomeGroup ------------

weekend : 2 unique values.
IncomeGroup : 5 unique values.
weekend                   0       1      All
IncomeGroup                                 
High income          719813  190836   910649
Low income              681      11      692
Lower middle income   15331     636    15967
Upper middle income   71237    8358    79595
nan                     718      71      789
All                  807780  199912  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 7806.58
D.f: 4


Effect size
Cramer's V: 0.09


------------ weekend and outcome ------------

weekend : 2 unique values.
outcome : 3 unique values.
weekend        0       1      All
outcome                          
expected  107201   28768   135969
gained    537374  121756   659130
lost      163205   49388   212593
All       807780  199912  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 2461.28
D.f: 2


Effect size
Cramer's V: 0.05


------------ filter_no_lcc and filter_non_stop ------------

filter_no_lcc : 2 unique values.
filter_non_stop : 2 unique values.
filter_no_lcc         0      1      All
filter_non_stop                        
0                896838   7036   903874
1                 97570   6248   103818
All              994408  13284  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 19649.65
D.f: 1


Effect size
Cramer's V: 0.14


------------ filter_no_lcc and filter_short_layover ------------

filter_no_lcc : 2 unique values.
filter_short_layover : 2 unique values.
filter_no_lcc              0      1      All
filter_short_layover                        
0                     965861  11063   976924
1                      28547   2221    30768
All                   994408  13284  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 8488.56
D.f: 1


Effect size
Cramer's V: 0.09


------------ filter_no_lcc and filter_name ------------

filter_no_lcc : 2 unique values.
filter_name : 6 unique values.
filter_no_lcc                 0      1      All
filter_name                                    
And(NonStop,NoLCC)            0   6248     6248
And(ShortLayover,NoLCC)       0   2221     2221
NoFilter                 868291      0   868291
NoLCC                         0   4815     4815
NonStop                   97570      0    97570
ShortLayover              28547      0    28547
All                      994408  13284  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 1007692.0
D.f: 5


Effect size
Cramer's V: 1.0
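
A Cramér's V of 1.0 here flags a redundant feature rather than a finding: every filter_name level falls in exactly one filter_no_lcc column, so one variable is derived from the other. The sketch below checks this directly from the printed table and recomputes the uncorrected X^2. The variable and function names are illustrative only.

```python
import math

# Counts from the table above: (filter_no_lcc = 0, filter_no_lcc = 1)
table = {
    "And(NonStop,NoLCC)":      (0, 6248),
    "And(ShortLayover,NoLCC)": (0, 2221),
    "NoFilter":                (868291, 0),
    "NoLCC":                   (0, 4815),
    "NonStop":                 (97570, 0),
    "ShortLayover":            (28547, 0),
}

def chi2_uncorrected(rows):
    """Pearson chi-squared statistic without continuity correction."""
    row_totals = [sum(r) for r in rows]
    col_totals = [sum(c) for c in zip(*rows)]
    n = sum(row_totals)
    chi2 = sum(
        (obs - row_totals[i] * col_totals[j] / n) ** 2
        / (row_totals[i] * col_totals[j] / n)
        for i, r in enumerate(rows)
        for j, obs in enumerate(r)
    )
    return chi2, n

rows = list(table.values())

# Perfect dependence: each filter_name row has exactly one non-zero cell
is_functional = all(sum(1 for count in r if count > 0) == 1 for r in rows)

chi2, n = chi2_uncorrected(rows)
v = math.sqrt(chi2 / (n * (min(len(rows), len(rows[0])) - 1)))

print(is_functional)   # True
print(round(chi2, 1))  # 1007692.0, equal to n
print(round(v, 2))     # 1.0
```

Pairs like this one can usually be reduced to a single column before modeling, since the second adds no information.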


------------ filter_no_lcc and first_rec ------------

filter_no_lcc : 2 unique values.
first_rec : 3 unique values.
filter_no_lcc       0      1      All
first_rec                            
buy            503667   4390   508057
nan             57523    640    58163
wait           433218   8254   441472
All            994408  13284  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 1858.65
D.f: 2


Effect size
Cramer's V: 0.04


------------ filter_no_lcc and last_rec ------------

filter_no_lcc : 2 unique values.
last_rec : 3 unique values.
filter_no_lcc       0      1      All
last_rec                             
buy            546315   5269   551584
nan             57523    640    58163
wait           390570   7375   397945
All            994408  13284  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 1455.58
D.f: 2


Effect size
Cramer's V: 0.04


------------ filter_no_lcc and is_session_1 ------------

filter_no_lcc : 2 unique values.
is_session_1 : 2 unique values.
filter_no_lcc       0      1      All
is_session_1                         
False          529927   7281   537208
True           464481   6003   470484
All            994408  13284  1007692


X^2 TEST

p-value to reject null: 0.0005
X^2: 12.1
D.f: 1


Effect size
Cramer's V: 0.0


------------ filter_no_lcc and Search or watch ------------

filter_no_lcc : 2 unique values.
Search or watch : 2 unique values.
filter_no_lcc         0      1      All
Search or watch                        
search           650508   5942   656450
watch            343900   7342   351242
All              994408  13284  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 2469.51
D.f: 1


Effect size
Cramer's V: 0.05


------------ filter_no_lcc and Use frequency ------------

filter_no_lcc : 2 unique values.
Use frequency : 2 unique values.
filter_no_lcc        0      1      All
Use frequency                         
more than once  922212  12634   934846
once             72196    650    72846
All             994408  13284  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 109.17
D.f: 1


Effect size
Cramer's V: 0.01


------------ filter_no_lcc and continent_origin ------------

filter_no_lcc : 2 unique values.
continent_origin : 6 unique values.
filter_no_lcc          0      1      All
continent_origin                        
AF                 16242    285    16527
AS                 29553    270    29823
EU                 58765    783    59548
NorthA            847638  11665   859303
OC                  7267    108     7375
SA                 34943    173    35116
All               994408  13284  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 255.43
D.f: 5


Effect size
Cramer's V: 0.02


------------ filter_no_lcc and City origin ------------

filter_no_lcc : 2 unique values.
City origin : 1237 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 2826.64
D.f: 1236


Effect size
Cramer's V: 0.05


------------ filter_no_lcc and City destination ------------

filter_no_lcc : 2 unique values.
City destination : 1487 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 3741.93
D.f: 1486


Effect size
Cramer's V: 0.06


------------ filter_no_lcc and region_origin ------------

filter_no_lcc : 2 unique values.
region_origin : 453 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 1897.52
D.f: 452


Effect size
Cramer's V: 0.04


------------ filter_no_lcc and region_destination ------------

filter_no_lcc : 2 unique values.
region_destination : 498 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 2458.01
D.f: 497


Effect size
Cramer's V: 0.05


------------ filter_no_lcc and country_origin ------------

filter_no_lcc : 2 unique values.
country_origin : 188 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 1205.64
D.f: 187


Effect size
Cramer's V: 0.03


------------ filter_no_lcc and continent_destination ------------

filter_no_lcc : 2 unique values.
continent_destination : 6 unique values.
filter_no_lcc               0      1      All
continent_destination                        
AF                      14652    140    14792
AS                      81365    575    81940
EU                     143204   1721   144925
NorthA                 709198  10537   719735
OC                      12031    142    12173
SA                      33958    169    34127
All                    994408  13284  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 571.58
D.f: 5


Effect size
Cramer's V: 0.02


------------ filter_no_lcc and country_destination ------------

filter_no_lcc : 2 unique values.
country_destination : 209 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 1573.82
D.f: 208


Effect size
Cramer's V: 0.04


------------ filter_no_lcc and Domestic or international ------------

filter_no_lcc : 2 unique values.
Domestic or international : 2 unique values.
filter_no_lcc                   0      1      All
Domestic or international                        
Domestic                   508007   8745   516752
International              486401   4539   490940
All                        994408  13284  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 1140.14
D.f: 1


Effect size
Cramer's V: 0.03
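
As a check on how the figures above relate, the following sketch recomputes the statistic and effect size from the printed 2x2 table using `scipy.stats.chi2_contingency`. The helper name `cramers_v` is hypothetical. The printed X^2 appears consistent with SciPy's default continuity correction for 2x2 tables; that is an inference from the numbers, not something the notebook states.

```python
import numpy as np
from scipy.stats import chi2_contingency


def cramers_v(table, correction=True):
    """Return (chi2, p, dof, V) for a contingency table without margins."""
    table = np.asarray(table, dtype=float)
    chi2, p, dof, _ = chi2_contingency(table, correction=correction)
    n = table.sum()
    k = min(table.shape) - 1  # smaller dimension minus one
    return chi2, p, dof, float(np.sqrt(chi2 / (n * k)))


# Rows: Domestic, International; columns: filter_no_lcc = 0, 1 (the "All" margins are left out).
observed = [[508007, 8745], [486401, 4539]]
chi2, p, dof, v = cramers_v(observed)
print(f"X^2: {chi2:.2f}  D.f: {dof}  Cramer's V: {v:.2f}")
```

Because V divides X^2 by the sample size, a statistic above 1000 still corresponds to a negligible effect here.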


------------ filter_no_lcc and Region ------------

filter_no_lcc : 2 unique values.
Region : 8 unique values.
filter_no_lcc                    0      1      All
Region                                            
East Asia & Pacific          23526    279    23805
Europe & Central Asia        59482    788    60270
Latin America & Caribbean    86457    585    87042
Middle East & North Africa    8594     64     8658
North America               796009  11253   807262
South Asia                    4178     24     4202
Sub-Saharan Africa           15387    277    15664
nan                            775     14      789
All                         994408  13284  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 384.92
D.f: 7


Effect size
Cramer's V: 0.02


------------ filter_no_lcc and IncomeGroup ------------

filter_no_lcc : 2 unique values.
IncomeGroup : 5 unique values.
filter_no_lcc             0      1      All
IncomeGroup                                
High income          898161  12488   910649
Low income              687      5      692
Lower middle income   15857    110    15967
Upper middle income   78928    667    79595
nan                     775     14      789
All                  994408  13284  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 212.61
D.f: 4


Effect size
Cramer's V: 0.01


------------ filter_no_lcc and outcome ------------

filter_no_lcc : 2 unique values.
outcome : 3 unique values.
filter_no_lcc       0      1      All
outcome                              
expected       133837   2132   135969
gained         653138   5992   659130
lost           207433   5160   212593
All            994408  13284  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 2923.11
D.f: 2


Effect size
Cramer's V: 0.05


------------ filter_non_stop and filter_short_layover ------------

filter_non_stop : 2 unique values.
filter_short_layover : 2 unique values.
filter_non_stop            0       1      All
filter_short_layover                         
0                     873106  103818   976924
1                      30768       0    30768
All                   903874  103818  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 3644.13
D.f: 1


Effect size
Cramer's V: 0.06


------------ filter_non_stop and filter_name ------------

filter_non_stop : 2 unique values.
filter_name : 6 unique values.
filter_non_stop               0       1      All
filter_name                                     
And(NonStop,NoLCC)            0    6248     6248
And(ShortLayover,NoLCC)    2221       0     2221
NoFilter                 868291       0   868291
NoLCC                      4815       0     4815
NonStop                       0   97570    97570
ShortLayover              28547       0    28547
All                      903874  103818  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 1007692.0
D.f: 5


Effect size
Cramer's V: 1.0
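
A Cramer's V of 1.0 with X^2 equal to the row count indicates that one variable fully determines the other. In the table above every filter_name level falls in exactly one filter_non_stop column, which suggests the flag was derived from filter_name, so this pair carries no independent information. A short sketch of that check, assuming NumPy and SciPy, with counts copied from the table:

```python
import numpy as np
from scipy.stats import chi2_contingency

# Rows: filter_name levels in printed order; columns: filter_non_stop = 0, 1.
observed = np.array([
    [0, 6248],       # And(NonStop,NoLCC)
    [2221, 0],       # And(ShortLayover,NoLCC)
    [868291, 0],     # NoFilter
    [4815, 0],       # NoLCC
    [0, 97570],      # NonStop
    [28547, 0],      # ShortLayover
])

# True when each row has exactly one non-zero cell, i.e. the row variable
# determines the column variable.
is_determined = bool(((observed > 0).sum(axis=1) == 1).all())

chi2, _, dof, _ = chi2_contingency(observed)
n = observed.sum()
v = float(np.sqrt(chi2 / (n * (min(observed.shape) - 1))))
print(is_determined, round(chi2, 2), dof, round(v, 2))
```

Pairs flagged this way are usually better dropped from the heatmap or collapsed into a single feature before modeling.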


------------ filter_non_stop and first_rec ------------

filter_non_stop : 2 unique values.
first_rec : 3 unique values.
filter_non_stop       0       1      All
first_rec                               
buy              462317   45740   508057
nan               53285    4878    58163
wait             388272   53200   441472
All              903874  103818  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 2619.35
D.f: 2


Effect size
Cramer's V: 0.05


------------ filter_non_stop and last_rec ------------

filter_non_stop : 2 unique values.
last_rec : 3 unique values.
filter_non_stop       0       1      All
last_rec                                
buy              497245   54339   551584
nan               53285    4878    58163
wait             353344   44601   397945
All              903874  103818  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 705.37
D.f: 2


Effect size
Cramer's V: 0.03


------------ filter_non_stop and is_session_1 ------------

filter_non_stop : 2 unique values.
is_session_1 : 2 unique values.
filter_non_stop       0       1      All
is_session_1                            
False            483151   54057   537208
True             420723   49761   470484
All              903874  103818  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 71.64
D.f: 1


Effect size
Cramer's V: 0.01


------------ filter_non_stop and Search or watch ------------

filter_non_stop : 2 unique values.
Search or watch : 2 unique values.
filter_non_stop       0       1      All
Search or watch                         
search           610300   46150   656450
watch            293574   57668   351242
All              903874  103818  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 21821.67
D.f: 1


Effect size
Cramer's V: 0.15


------------ filter_non_stop and Use frequency ------------

filter_non_stop : 2 unique values.
Use frequency : 2 unique values.
filter_non_stop       0       1      All
Use frequency                           
more than once   837360   97486   934846
once              66514    6332    72846
All              903874  103818  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 220.13
D.f: 1


Effect size
Cramer's V: 0.01


------------ filter_non_stop and continent_origin ------------

filter_non_stop : 2 unique values.
continent_origin : 6 unique values.
filter_non_stop        0       1      All
continent_origin                         
AF                 14311    2216    16527
AS                 27100    2723    29823
EU                 54194    5354    59548
NorthA            768105   91198   859303
OC                  6874     501     7375
SA                 33290    1826    35116
All               903874  103818  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 1504.99
D.f: 5


Effect size
Cramer's V: 0.04


------------ filter_non_stop and City origin ------------

filter_non_stop : 2 unique values.
City origin : 1237 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 19141.85
D.f: 1236


Effect size
Cramer's V: 0.14


------------ filter_non_stop and City destination ------------

filter_non_stop : 2 unique values.
City destination : 1487 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 18944.79
D.f: 1486


Effect size
Cramer's V: 0.14


------------ filter_non_stop and region_origin ------------

filter_non_stop : 2 unique values.
region_origin : 453 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 13218.61
D.f: 452


Effect size
Cramer's V: 0.11


------------ filter_non_stop and region_destination ------------

filter_non_stop : 2 unique values.
region_destination : 498 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 13335.57
D.f: 497


Effect size
Cramer's V: 0.12


------------ filter_non_stop and country_origin ------------

filter_non_stop : 2 unique values.
country_origin : 188 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 4061.28
D.f: 187


Effect size
Cramer's V: 0.06


------------ filter_non_stop and continent_destination ------------

filter_non_stop : 2 unique values.
continent_destination : 6 unique values.
filter_non_stop             0       1      All
continent_destination                         
AF                      13875     917    14792
AS                      77013    4927    81940
EU                     133273   11652   144925
NorthA                 635921   83814   719735
OC                      11552     621    12173
SA                      32240    1887    34127
All                    903874  103818  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 5305.45
D.f: 5


Effect size
Cramer's V: 0.07


------------ filter_non_stop and country_destination ------------

filter_non_stop : 2 unique values.
country_destination : 209 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 9090.51
D.f: 208


Effect size
Cramer's V: 0.09


------------ filter_non_stop and Domestic or international ------------

filter_non_stop : 2 unique values.
Domestic or international : 2 unique values.
filter_non_stop                 0       1      All
Domestic or international                         
Domestic                   452960   63792   516752
International              450914   40026   490940
All                        903874  103818  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 4786.65
D.f: 1


Effect size
Cramer's V: 0.07


------------ filter_non_stop and Region ------------

filter_non_stop : 2 unique values.
Region : 8 unique values.
filter_non_stop                  0       1      All
Region                                             
East Asia & Pacific          21876    1929    23805
Europe & Central Asia        54819    5451    60270
Latin America & Caribbean    81322    5720    87042
Middle East & North Africa    7745     913     8658
North America               719959   87303   807262
South Asia                    3934     268     4202
Sub-Saharan Africa           13497    2167    15664
nan                            722      67      789
All                         903874  103818  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 2052.95
D.f: 7


Effect size
Cramer's V: 0.05


------------ filter_non_stop and IncomeGroup ------------

filter_non_stop : 2 unique values.
IncomeGroup : 5 unique values.
filter_non_stop           0       1      All
IncomeGroup                                 
High income          814061   96588   910649
Low income              631      61      692
Lower middle income   15038     929    15967
Upper middle income   73422    6173    79595
nan                     722      67      789
All                  903874  103818  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 1001.71
D.f: 4


Effect size
Cramer's V: 0.03


------------ filter_non_stop and outcome ------------

filter_non_stop : 2 unique values.
outcome : 3 unique values.
filter_non_stop       0       1      All
outcome                                 
expected         114521   21448   135969
gained           612378   46752   659130
lost             176975   35618   212593
All              903874  103818  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 21327.74
D.f: 2


Effect size
Cramer's V: 0.15


------------ filter_short_layover and filter_name ------------

filter_short_layover : 2 unique values.
filter_name : 6 unique values.
filter_short_layover          0      1      All
filter_name                                    
And(NonStop,NoLCC)         6248      0     6248
And(ShortLayover,NoLCC)       0   2221     2221
NoFilter                 868291      0   868291
NoLCC                      4815      0     4815
NonStop                   97570      0    97570
ShortLayover                  0  28547    28547
All                      976924  30768  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 1007692.0
D.f: 5


Effect size
Cramer's V: 1.0


------------ filter_short_layover and first_rec ------------

filter_short_layover : 2 unique values.
first_rec : 3 unique values.
filter_short_layover       0      1      All
first_rec                                   
buy                   493843  14214   508057
nan                    56755   1408    58163
wait                  426326  15146   441472
All                   976924  30768  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 403.26
D.f: 2


Effect size
Cramer's V: 0.02


------------ filter_short_layover and last_rec ------------

filter_short_layover : 2 unique values.
last_rec : 3 unique values.
filter_short_layover       0      1      All
last_rec                                    
buy                   535643  15941   551584
nan                    56755   1408    58163
wait                  384526  13419   397945
All                   976924  30768  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 264.89
D.f: 2


Effect size
Cramer's V: 0.02


------------ filter_short_layover and is_session_1 ------------

filter_short_layover : 2 unique values.
is_session_1 : 2 unique values.
filter_short_layover       0      1      All
is_session_1                                
False                 521846  15362   537208
True                  455078  15406   470484
All                   976924  30768  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 145.72
D.f: 1


Effect size
Cramer's V: 0.01


------------ filter_short_layover and Search or watch ------------

filter_short_layover : 2 unique values.
Search or watch : 2 unique values.
filter_short_layover       0      1      All
Search or watch                             
search                643303  13147   656450
watch                 333621  17621   351242
All                   976924  30768  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 7021.13
D.f: 1


Effect size
Cramer's V: 0.08


------------ filter_short_layover and Use frequency ------------

filter_short_layover : 2 unique values.
Use frequency : 2 unique values.
filter_short_layover       0      1      All
Use frequency                               
more than once        905973  28873   934846
once                   70951   1895    72846
All                   976924  30768  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 54.02
D.f: 1


Effect size
Cramer's V: 0.01


------------ filter_short_layover and continent_origin ------------

filter_short_layover : 2 unique values.
continent_origin : 6 unique values.
filter_short_layover       0      1      All
continent_origin                            
AF                     16101    426    16527
AS                     28816   1007    29823
EU                     57534   2014    59548
NorthA                833221  26082   859303
OC                      7048    327     7375
SA                     34204    912    35116
All                   976924  30768  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 118.04
D.f: 5


Effect size
Cramer's V: 0.01


------------ filter_short_layover and City origin ------------

filter_short_layover : 2 unique values.
City origin : 1237 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 7511.6
D.f: 1236


Effect size
Cramer's V: 0.09


------------ filter_short_layover and City destination ------------

filter_short_layover : 2 unique values.
City destination : 1487 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 11872.04
D.f: 1486


Effect size
Cramer's V: 0.11


------------ filter_short_layover and region_origin ------------

filter_short_layover : 2 unique values.
region_origin : 453 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 3419.67
D.f: 452


Effect size
Cramer's V: 0.06


------------ filter_short_layover and region_destination ------------

filter_short_layover : 2 unique values.
region_destination : 498 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 6937.82
D.f: 497


Effect size
Cramer's V: 0.08


------------ filter_short_layover and country_origin ------------

filter_short_layover : 2 unique values.
country_origin : 188 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 1243.59
D.f: 187


Effect size
Cramer's V: 0.04


------------ filter_short_layover and continent_destination ------------

filter_short_layover : 2 unique values.
continent_destination : 6 unique values.
filter_short_layover        0      1      All
continent_destination                        
AF                      14206    586    14792
AS                      78490   3450    81940
EU                     139960   4965   144925
NorthA                 699787  19948   719735
OC                      11574    599    12173
SA                      32907   1220    34127
All                    976924  30768  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 847.58
D.f: 5


Effect size
Cramer's V: 0.03


------------ filter_short_layover and country_destination ------------

filter_short_layover : 2 unique values.
country_destination : 209 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 4236.05
D.f: 208


Effect size
Cramer's V: 0.06


------------ filter_short_layover and Domestic or international ------------

filter_short_layover : 2 unique values.
Domestic or international : 2 unique values.
filter_short_layover            0      1      All
Domestic or international                        
Domestic                   505031  11721   516752
International              471893  19047   490940
All                        976924  30768  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 2208.15
D.f: 1


Effect size
Cramer's V: 0.05


------------ filter_short_layover and Region ------------

filter_short_layover : 2 unique values.
Region : 8 unique values.
filter_short_layover             0      1      All
Region                                            
East Asia & Pacific          23019    786    23805
Europe & Central Asia        58235   2035    60270
Latin America & Caribbean    84828   2214    87042
Middle East & North Africa    8399    259     8658
North America               782483  24779   807262
South Asia                    3944    258     4202
Sub-Saharan Africa           15254    410    15664
nan                            762     27      789
All                         976924  30768  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 249.11
D.f: 7


Effect size
Cramer's V: 0.02


------------ filter_short_layover and IncomeGroup ------------

filter_short_layover : 2 unique values.
IncomeGroup : 5 unique values.
filter_short_layover       0      1      All
IncomeGroup                                 
High income           882632  28017   910649
Low income               664     28      692
Lower middle income    15382    585    15967
Upper middle income    77484   2111    79595
nan                      762     27      789
All                   976924  30768  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 67.71
D.f: 4


Effect size
Cramer's V: 0.01


------------ filter_short_layover and outcome ------------

filter_short_layover : 2 unique values.
outcome : 3 unique values.
filter_short_layover       0      1      All
outcome                                     
expected              129198   6771   135969
gained                645837  13293   659130
lost                  201889  10704   212593
All                   976924  30768  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 6917.68
D.f: 2


Effect size
Cramer's V: 0.08


------------ filter_name and first_rec ------------

filter_name : 6 unique values.
first_rec : 3 unique values.
filter_name  And(NonStop,NoLCC)  And(ShortLayover,NoLCC)  NoFilter  NoLCC  NonStop  ShortLayover      All
first_rec
buy                        1788                      762    446263   1840    43952         13452   508057
nan                         268                       88     51593    284     4610          1320    58163
wait                       4192                     1371    370435   2691    49008         13775   441472
All                        6248                     2221    868291   4815    97570         28547  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 4363.87
D.f: 10


Effect size
Cramer's V: 0.05


------------ filter_name and last_rec ------------

filter_name : 6 unique values.
last_rec : 3 unique values.
filter_name  And(NonStop,NoLCC)  And(ShortLayover,NoLCC)  NoFilter  NoLCC  NonStop  ShortLayover      All
last_rec
buy                        2384                      896    479315   1989    51955         15045   551584
nan                         268                       88     51593    284     4610          1320    58163
wait                       3596                     1237    337383   2542    41005         12182   397945
All                        6248                     2221    868291   4815    97570         28547  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 2182.75
D.f: 10


Effect size
Cramer's V: 0.03


------------ filter_name and is_session_1 ------------

filter_name : 6 unique values.
is_session_1 : 2 unique values.
filter_name   And(NonStop,NoLCC)  And(ShortLayover,NoLCC)  NoFilter  NoLCC  NonStop  ShortLayover      All
is_session_1
False                       3494                     1204    465206   2583    50563         14158   537208
True                        2754                     1017    403085   2232    47007         14389   470484
All                         6248                     2221    868291   4815    97570         28547  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 287.98
D.f: 5


Effect size
Cramer's V: 0.02


------------ filter_name and Search or watch ------------

filter_name : 6 unique values.
Search or watch : 2 unique values.
filter_name      And(NonStop,NoLCC)  And(ShortLayover,NoLCC)  NoFilter  NoLCC  NonStop  ShortLayover      All
Search or watch
search                         2189                      841    594241   2912    43961         12306   656450
watch                          4059                     1380    274050   1903    53609         16241   351242
All                            6248                     2221    868291   4815    97570         28547  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 30862.29
D.f: 5


Effect size
Cramer's V: 0.18


------------ filter_name and Use frequency ------------

filter_name : 6 unique values.
Use frequency : 2 unique values.
filter_name     And(NonStop,NoLCC)  And(ShortLayover,NoLCC)  NoFilter  NoLCC  NonStop  ShortLayover      All
Use frequency
more than once                5952                     2127    803932   4555    91534         26746   934846
once                           296                       94     64359    260     6036          1801    72846
All                           6248                     2221    868291   4815    97570         28547  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 349.23
D.f: 5


Effect size
Cramer's V: 0.02


------------ filter_name and continent_origin ------------

filter_name : 6 unique values.
continent_origin : 6 unique values.
filter_name       And(NonStop,NoLCC)  And(ShortLayover,NoLCC)  NoFilter  NoLCC  NonStop  ShortLayover      All
continent_origin
AF                               152                       27     13779    106     2064           399    16527
AS                                99                       39     25961    132     2624           968    29823
EU                               288                      204     51889    291     5066          1810    59548
NorthA                          5627                     1876    737861   4162    85571         24206   859303
OC                                34                       30      6503     44      467           297     7375
SA                                48                       45     32298     80     1778           867    35116
All                             6248                     2221    868291   4815    97570         28547  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 1807.63
D.f: 25


Effect size
Cramer's V: 0.02


------------ filter_name and City origin ------------

filter_name : 6 unique values.
City origin : 1237 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 31279.43
D.f: 6180


Effect size
Cramer's V: 0.08
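
The degrees of freedom reported above follow from the number of levels in each variable: (rows - 1) x (columns - 1). A one-line sketch, with a hypothetical helper name, makes the arithmetic explicit and can serve as a sanity check on the other high-cardinality pairs:

```python
def chi2_dof(n_levels_a, n_levels_b):
    """Degrees of freedom of a chi-square test of independence."""
    return (n_levels_a - 1) * (n_levels_b - 1)


# filter_name has 6 levels and City origin has 1237.
dof_filter_city = chi2_dof(6, 1237)
print(dof_filter_city)  # 6180
```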


------------ filter_name and City destination ------------

filter_name : 6 unique values.
City destination : 1487 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 35963.15
D.f: 7430


Effect size
Cramer's V: 0.08


------------ filter_name and region_origin ------------

filter_name : 6 unique values.
region_origin : 453 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 19205.47
D.f: 2260


Effect size
Cramer's V: 0.06


------------ filter_name and region_destination ------------

filter_name : 6 unique values.
region_destination : 498 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 23010.64
D.f: 2485


Effect size
Cramer's V: 0.07


------------ filter_name and country_origin ------------

filter_name : 6 unique values.
country_origin : 188 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 6675.51
D.f: 935


Effect size
Cramer's V: 0.04


------------ filter_name and continent_destination ------------

filter_name : 6 unique values.
continent_destination : 6 unique values.
filter_name            And(NonStop,NoLCC)  And(ShortLayover,NoLCC)  NoFilter  NoLCC  NonStop  ShortLayover      All
continent_destination
AF                                     50                       16     13215     74      867           570    14792
AS                                    140                      156     73284    279     4787          3294    81940
EU                                    645                      383    127615    693    11007          4582   144925
NorthA                               5343                     1551    612330   3643    78471         18397   719735
OC                                     38                       37     10886     67      583           562    12173
SA                                     32                       78     30961     59     1855          1142    34127
All                                  6248                     2221    868291   4815    97570         28547  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 6266.34
D.f: 25


Effect size
Cramer's V: 0.04


------------ filter_name and country_destination ------------

filter_name : 6 unique values.
country_destination : 209 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 14879.42
D.f: 1040


Effect size
Cramer's V: 0.05


------------ filter_name and Domestic or international ------------

filter_name : 6 unique values.
Domestic or international : 2 unique values.
filter_name                And(NonStop,NoLCC)  And(ShortLayover,NoLCC)  NoFilter  NoLCC  NonStop  ShortLayover      All
Domestic or international
Domestic                                 4714                     1057    438265   2974    59078         10664   516752
International                            1534                     1164    430026   1841    38492         17883   490940
All                                      6248                     2221    868291   4815    97570         28547  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 7481.1
D.f: 5


Effect size
Cramer's V: 0.09


------------ filter_name and Region ------------

filter_name : 6 unique values.
Region : 8 unique values.
filter_name                 And(NonStop,NoLCC)  And(ShortLayover,NoLCC)  NoFilter  NoLCC  NonStop  ShortLayover      All
Region
East Asia & Pacific                        104                       47     20962    128     1825           739    23805
Europe & Central Asia                      292                      203     52491    293     5159          1832    60270
Latin America & Caribbean                  201                      144     78868    240     5519          2070    87042
Middle East & North Africa                  23                        7      7452     34      890           252     8658
North America                             5474                     1777    691178   4002    81829         23002   807262
South Asia                                   4                       13      3669      7      264           245     4202
Sub-Saharan Africa                         149                       27     12986    101     2018           383    15664
nan                                          1                        3       685     10       66            24      789
All                                       6248                     2221    868291   4815    97570         28547  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 2585.99
D.f: 35


Effect size
Cramer's V: 0.02


------------ filter_name and IncomeGroup ------------

filter_name : 6 unique values.
IncomeGroup : 5 unique values.
filter_name          And(NonStop,NoLCC)  And(ShortLayover,NoLCC)  NoFilter  NoLCC  NonStop  ShortLayover      All
IncomeGroup
High income                        5948                     2060    781564   4480    90640         25957   910649
Low income                            1                        2       601      2       60            26      692
Lower middle income                  31                       18     14392     61      898           567    15967
Upper middle income                 267                      138     71049    262     5906          1973    79595
nan                                   1                        3       685     10       66            24      789
All                                6248                     2221    868291   4815    97570         28547  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 1206.94
D.f: 20


Effect size
Cramer's V: 0.02


------------ filter_name and outcome ------------

filter_name : 6 unique values.
outcome : 3 unique values.
filter_name  And(NonStop,NoLCC)  And(ShortLayover,NoLCC)  NoFilter  NoLCC  NonStop  ShortLayover      All
outcome
expected                   1221                      465    107304    446    20227          6306   135969
gained                     2206                      857    596156   2929    44546         12436   659130
lost                       2821                      899    164831   1440    32797          9805   212593
All                        6248                     2221    868291   4815    97570         28547  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 30681.31
D.f: 10


Effect size
Cramer's V: 0.12


------------ first_rec and last_rec ------------

first_rec : 3 unique values.
last_rec : 3 unique values.
first_rec     buy    nan    wait      All
last_rec                                 
buy        483188      0   68396   551584
nan             0  58163       0    58163
wait        24869      0  373076   397945
All        508057  58163  441472  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 1660326.03
D.f: 4


Effect size
Cramer's V: 0.91
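
The first_rec and last_rec figures above can be rechecked by hand from the printed contingency table. The following is a minimal sketch in plain Python, assuming the summary function uses the uncorrected Pearson chi-square and the standard formula V = sqrt(X^2 / (n * min(r - 1, c - 1))). The helper name chi2_and_cramers_v is hypothetical and is not the notebook's own function.

```python
import math

def chi2_and_cramers_v(table):
    """Return (chi2, dof, V) for a table of observed counts without margins."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of the two variables.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected

    n_rows, n_cols = len(table), len(col_totals)
    dof = (n_rows - 1) * (n_cols - 1)
    v = math.sqrt(chi2 / (n * min(n_rows - 1, n_cols - 1)))
    return chi2, dof, v

# Rows: last_rec (buy, nan, wait); columns: first_rec (buy, nan, wait).
# The "All" margins from the printed table are left out on purpose.
observed = [
    [483188, 0, 68396],
    [0, 58163, 0],
    [24869, 0, 373076],
]

chi2, dof, v = chi2_and_cramers_v(observed)
print(round(chi2, 2), dof, round(v, 2))
```

If the assumptions hold, the result should land close to the printed X^2, D.f and Cramer's V. Note that the nan level is treated as a category of its own here, as it is in the table, and its perfect overlap between the two variables contributes to the high effect size.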


------------ first_rec and is_session_1 ------------

first_rec : 3 unique values.
is_session_1 : 2 unique values.
first_rec        buy    nan    wait      All
is_session_1                                
False         270248  37007  229953   537208
True          237809  21156  211519   470484
All           508057  58163  441472  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 2754.73
D.f: 2


Effect size
Cramer's V: 0.05


------------ first_rec and Search or watch ------------

first_rec : 3 unique values.
Search or watch : 2 unique values.
first_rec           buy    nan    wait      All
Search or watch                                
search           358436  49929  248085   656450
watch            149621   8234  193387   351242
All              508057  58163  441472  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 33085.37
D.f: 2


Effect size
Cramer's V: 0.18


------------ first_rec and Use frequency ------------

first_rec : 3 unique values.
Use frequency : 2 unique values.
first_rec          buy    nan    wait      All
Use frequency                                 
more than once  470524  54557  409765   934846
once             37533   3606   31707    72846
All             508057  58163  441472  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 112.36
D.f: 2


Effect size
Cramer's V: 0.01


------------ first_rec and continent_origin ------------

first_rec : 3 unique values.
continent_origin : 6 unique values.
first_rec            buy    nan    wait      All
continent_origin                                
AF                  7808    917    7802    16527
AS                 18258   1868    9697    29823
EU                 35347   3122   21079    59548
NorthA            421576  50113  387614   859303
OC                  5343    359    1673     7375
SA                 19725   1784   13607    35116
All               508057  58163  441472  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 6170.56
D.f: 10


Effect size
Cramer's V: 0.06


------------ first_rec and City origin ------------

first_rec : 3 unique values.
City origin : 1237 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 23529.67
D.f: 2472


Effect size
Cramer's V: 0.11
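
When the contingency table is too large to print, the reported D.f and Cramer's V can still be sanity-checked from the number of unique values and the X^2 statistic alone. A small sketch, assuming D.f = (r - 1)(c - 1) and the uncorrected Cramer's V with n = 1007692 rows (the "All" total shown in the printed tables); both function names are hypothetical.

```python
import math

def degrees_of_freedom(n_levels_a, n_levels_b):
    """Degrees of freedom of a chi-square test of independence."""
    return (n_levels_a - 1) * (n_levels_b - 1)

def cramers_v_from_stat(chi2, n, n_levels_a, n_levels_b):
    """Cramer's V from an already computed chi-square statistic."""
    return math.sqrt(chi2 / (n * (min(n_levels_a, n_levels_b) - 1)))

n = 1007692  # total number of rows, taken from the "All" margin

# first_rec (3 unique values) and City origin (1237 unique values)
print(degrees_of_freedom(3, 1237))
print(round(cramers_v_from_stat(23529.67, n, 3, 1237), 2))
```

With 1237 cities spread over three first_rec levels, many cells probably have very small expected counts, so the chi-square approximation is likely less reliable here than for the low-cardinality pairs. The effect size is the more useful number to compare across pairs.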


------------ first_rec and City destination ------------

first_rec : 3 unique values.
City destination : 1487 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 35538.38
D.f: 2972


Effect size
Cramer's V: 0.13


------------ first_rec and region_origin ------------

first_rec : 3 unique values.
region_origin : 453 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 15923.13
D.f: 904


Effect size
Cramer's V: 0.09


------------ first_rec and region_destination ------------

first_rec : 3 unique values.
region_destination : 498 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 23043.34
D.f: 994


Effect size
Cramer's V: 0.11


------------ first_rec and country_origin ------------

first_rec : 3 unique values.
country_origin : 188 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 10971.02
D.f: 374


Effect size
Cramer's V: 0.07


------------ first_rec and continent_destination ------------

first_rec : 3 unique values.
continent_destination : 6 unique values.
first_rec                 buy    nan    wait      All
continent_destination                                
AF                       8323    782    5687    14792
AS                      42990   4798   34152    81940
EU                      84523   8327   52075   144925
NorthA                 347614  41679  330442   719735
OC                       6860    681    4632    12173
SA                      17747   1896   14484    34127
All                    508057  58163  441472  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 5786.51
D.f: 10


Effect size
Cramer's V: 0.05


------------ first_rec and country_destination ------------

first_rec : 3 unique values.
country_destination : 209 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 13368.17
D.f: 416


Effect size
Cramer's V: 0.08


------------ first_rec and Domestic or international ------------

first_rec : 3 unique values.
Domestic or international : 2 unique values.
first_rec                     buy    nan    wait      All
Domestic or international                                
Domestic                   248344  30229  238179   516752
International              259713  27934  203293   490940
All                        508057  58163  441472  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 2442.16
D.f: 2


Effect size
Cramer's V: 0.05


------------ first_rec and Region ------------

first_rec : 3 unique values.
Region : 8 unique values.
first_rec                      buy    nan    wait      All
Region                                                    
East Asia & Pacific          15070   1429    7306    23805
Europe & Central Asia        35730   3161   21379    60270
Latin America & Caribbean    49365   4676   33001    87042
Middle East & North Africa    5253    549    2856     8658
North America               391877  47217  368168   807262
South Asia                    2984    232     986     4202
Sub-Saharan Africa            7310    859    7495    15664
nan                            468     40     281      789
All                         508057  58163  441472  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 7508.71
D.f: 14


Effect size
Cramer's V: 0.06


------------ first_rec and IncomeGroup ------------

first_rec : 3 unique values.
IncomeGroup : 5 unique values.
first_rec               buy    nan    wait      All
IncomeGroup                                        
High income          454380  52788  403481   910649
Low income              424     41     227      692
Lower middle income    9662    945    5360    15967
Upper middle income   43123   4349   32123    79595
nan                     468     40     281      789
All                  508057  58163  441472  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 1320.24
D.f: 8


Effect size
Cramer's V: 0.03


------------ first_rec and outcome ------------

first_rec : 3 unique values.
outcome : 3 unique values.
first_rec     buy    nan    wait      All
outcome                                  
expected    48028   2725   85216   135969
gained     360065  49990  249075   659130
lost        99964   5448  107181   212593
All        508057  58163  441472  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 38452.58
D.f: 4


Effect size
Cramer's V: 0.14


------------ last_rec and is_session_1 ------------

last_rec : 3 unique values.
is_session_1 : 2 unique values.
last_rec         buy    nan    wait      All
is_session_1                                
False         287699  37007  212502   537208
True          263885  21156  185443   470484
All           551584  58163  397945  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 2781.99
D.f: 2


Effect size
Cramer's V: 0.05


------------ last_rec and Search or watch ------------

last_rec : 3 unique values.
Search or watch : 2 unique values.
last_rec            buy    nan    wait      All
Search or watch                                
search           359156  49929  247365   656450
watch            192428   8234  150580   351242
All              551584  58163  397945  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 12535.07
D.f: 2


Effect size
Cramer's V: 0.11


------------ last_rec and Use frequency ------------

last_rec : 3 unique values.
Use frequency : 2 unique values.
last_rec           buy    nan    wait      All
Use frequency                                 
more than once  509313  54557  370976   934846
once             42271   3606   26969    72846
All             551584  58163  397945  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 368.38
D.f: 2


Effect size
Cramer's V: 0.02


------------ last_rec and continent_origin ------------

last_rec : 3 unique values.
continent_origin : 6 unique values.
last_rec             buy    nan    wait      All
continent_origin                                
AF                  8587    917    7023    16527
AS                 18874   1868    9081    29823
EU                 36165   3122   20261    59548
NorthA            461860  50113  347330   859303
OC                  5339    359    1677     7375
SA                 20759   1784   12573    35116
All               551584  58163  397945  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 3535.22
D.f: 10


Effect size
Cramer's V: 0.04


------------ last_rec and City origin ------------

last_rec : 3 unique values.
City origin : 1237 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 18409.82
D.f: 2472


Effect size
Cramer's V: 0.1


------------ last_rec and City destination ------------

last_rec : 3 unique values.
City destination : 1487 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 27990.5
D.f: 2972


Effect size
Cramer's V: 0.12


------------ last_rec and region_origin ------------

last_rec : 3 unique values.
region_origin : 453 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 11703.89
D.f: 904


Effect size
Cramer's V: 0.08


------------ last_rec and region_destination ------------

last_rec : 3 unique values.
region_destination : 498 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 16414.34
D.f: 994


Effect size
Cramer's V: 0.09


------------ last_rec and country_origin ------------

last_rec : 3 unique values.
country_origin : 188 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 7499.31
D.f: 374


Effect size
Cramer's V: 0.06


------------ last_rec and continent_destination ------------

last_rec : 3 unique values.
continent_destination : 6 unique values.
last_rec                  buy    nan    wait      All
continent_destination                                
AF                       8789    782    5221    14792
AS                      44276   4798   32866    81940
EU                      86410   8327   50188   144925
NorthA                 386060  41679  291996   719735
OC                       6988    681    4504    12173
SA                      19061   1896   13170    34127
All                    551584  58163  397945  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 2071.24
D.f: 10


Effect size
Cramer's V: 0.03


------------ last_rec and country_destination ------------

last_rec : 3 unique values.
country_destination : 209 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 8650.63
D.f: 416


Effect size
Cramer's V: 0.07


------------ last_rec and Domestic or international ------------

last_rec : 3 unique values.
Domestic or international : 2 unique values.
last_rec                      buy    nan    wait      All
Domestic or international                                
Domestic                   278530  30229  207993   516752
International              273054  27934  189952   490940
All                        551584  58163  397945  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 301.84
D.f: 2


Effect size
Cramer's V: 0.02


------------ last_rec and Region ------------

last_rec : 3 unique values.
Region : 8 unique values.
last_rec                       buy    nan    wait      All
Region                                                    
East Asia & Pacific          15311   1429    7065    23805
Europe & Central Asia        36565   3161   20544    60270
Latin America & Caribbean    52026   4676   30340    87042
Middle East & North Africa    5474    549    2635     8658
North America               430531  47217  329514   807262
South Asia                    3128    232     842     4202
Sub-Saharan Africa            8065    859    6740    15664
nan                            484     40     265      789
All                         551584  58163  397945  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 4500.01
D.f: 14


Effect size
Cramer's V: 0.05


------------ last_rec and IncomeGroup ------------

last_rec : 3 unique values.
IncomeGroup : 5 unique values.
last_rec                buy    nan    wait      All
IncomeGroup                                        
High income          494802  52788  363059   910649
Low income              450     41     201      692
Lower middle income   10130    945    4892    15967
Upper middle income   45718   4349   29528    79595
nan                     484     40     265      789
All                  551584  58163  397945  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 880.32
D.f: 8


Effect size
Cramer's V: 0.02


------------ last_rec and outcome ------------

last_rec : 3 unique values.
outcome : 3 unique values.
last_rec     buy    nan    wait      All
outcome                                 
expected   61426   2725   71818   135969
gained    361246  49990  247894   659130
lost      128912   5448   78233   212593
All       551584  58163  397945  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 21609.47
D.f: 4


Effect size
Cramer's V: 0.1


------------ is_session_1 and Search or watch ------------

is_session_1 : 2 unique values.
Search or watch : 2 unique values.
is_session_1      False    True      All
Search or watch                         
search           384170  272280   656450
watch            153038  198204   351242
All              537208  470484  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 20550.59
D.f: 1


Effect size
Cramer's V: 0.14


------------ is_session_1 and Use frequency ------------

is_session_1 : 2 unique values.
Use frequency : 2 unique values.
is_session_1     False    True      All
Use frequency                          
more than once  537208  397638   934846
once                 0   72846    72846
All             537208  470484  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 89656.11
D.f: 1


Effect size
Cramer's V: 0.3


------------ is_session_1 and continent_origin ------------

is_session_1 : 2 unique values.
continent_origin : 6 unique values.
is_session_1       False    True      All
continent_origin                         
AF                  8909    7618    16527
AS                 14677   15146    29823
EU                 28344   31204    59548
NorthA            466042  393261   859303
OC                  3108    4267     7375
SA                 16128   18988    35116
All               537208  470484  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 2417.53
D.f: 5


Effect size
Cramer's V: 0.05


------------ is_session_1 and City origin ------------

is_session_1 : 2 unique values.
City origin : 1237 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 12218.15
D.f: 1236


Effect size
Cramer's V: 0.11


------------ is_session_1 and City destination ------------

is_session_1 : 2 unique values.
City destination : 1487 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 6663.13
D.f: 1486


Effect size
Cramer's V: 0.08


------------ is_session_1 and region_origin ------------

is_session_1 : 2 unique values.
region_origin : 453 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 9036.59
D.f: 452


Effect size
Cramer's V: 0.09


------------ is_session_1 and region_destination ------------

is_session_1 : 2 unique values.
region_destination : 498 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 3947.76
D.f: 497


Effect size
Cramer's V: 0.06


------------ is_session_1 and country_origin ------------

is_session_1 : 2 unique values.
country_origin : 188 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 5814.61
D.f: 187


Effect size
Cramer's V: 0.08


------------ is_session_1 and continent_destination ------------

is_session_1 : 2 unique values.
continent_destination : 6 unique values.
is_session_1            False    True      All
continent_destination                         
AF                       7719    7073    14792
AS                      41975   39965    81940
EU                      75527   69398   144925
NorthA                 388201  331534   719735
OC                       6092    6081    12173
SA                      17694   16433    34127
All                    537208  470484  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 428.67
D.f: 5


Effect size
Cramer's V: 0.02


------------ is_session_1 and country_destination ------------

is_session_1 : 2 unique values.
country_destination : 209 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 2180.14
D.f: 208


Effect size
Cramer's V: 0.05


------------ is_session_1 and Domestic or international ------------

is_session_1 : 2 unique values.
Domestic or international : 2 unique values.
is_session_1                False    True      All
Domestic or international                         
Domestic                   278840  237912   516752
International              258368  232572   490940
All                        537208  470484  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 179.65
D.f: 1


Effect size
Cramer's V: 0.01


------------ is_session_1 and Region ------------

is_session_1 : 2 unique values.
Region : 8 unique values.
is_session_1                 False    True      All
Region                                             
East Asia & Pacific          11541   12264    23805
Europe & Central Asia        28661   31609    60270
Latin America & Caribbean    43681   43361    87042
Middle East & North Africa    4113    4545     8658
North America               438418  368844   807262
South Asia                    1936    2266     4202
Sub-Saharan Africa            8468    7196    15664
nan                            390     399      789
All                         537208  470484  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 1904.69
D.f: 7


Effect size
Cramer's V: 0.04


------------ is_session_1 and IncomeGroup ------------

is_session_1 : 2 unique values.
IncomeGroup : 5 unique values.
is_session_1          False    True      All
IncomeGroup                                 
High income          488371  422278   910649
Low income              401     291      692
Lower middle income    8672    7295    15967
Upper middle income   39374   40221    79595
nan                     390     399      789
All                  537208  470484  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 526.45
D.f: 4


Effect size
Cramer's V: 0.02


------------ is_session_1 and outcome ------------

is_session_1 : 2 unique values.
outcome : 3 unique values.
is_session_1   False    True      All
outcome                              
expected       63587   72382   135969
gained        385611  273519   659130
lost           88010  124583   212593
All           537208  470484  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 21599.67
D.f: 2


Effect size
Cramer's V: 0.15


------------ Search or watch and Use frequency ------------

Search or watch : 2 unique values.
Use frequency : 2 unique values.
Search or watch  search   watch      All
Use frequency                           
more than once   618755  316091   934846
once              37695   35151    72846
All              656450  351242  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 6206.7
D.f: 1


Effect size
Cramer's V: 0.08


------------ Search or watch and continent_origin ------------

Search or watch : 2 unique values.
continent_origin : 6 unique values.
Search or watch   search   watch      All
continent_origin                         
AF                 10456    6071    16527
AS                 20661    9162    29823
EU                 37735   21813    59548
NorthA            559779  299524   859303
OC                  4718    2657     7375
SA                 23101   12015    35116
All               656450  351242  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 343.63
D.f: 5


Effect size
Cramer's V: 0.02


------------ Search or watch and City origin ------------

Search or watch : 2 unique values.
City origin : 1237 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 4252.77
D.f: 1236


Effect size
Cramer's V: 0.06


------------ Search or watch and City destination ------------

Search or watch : 2 unique values.
City destination : 1487 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 7138.07
D.f: 1486


Effect size
Cramer's V: 0.08


------------ Search or watch and region_origin ------------

Search or watch : 2 unique values.
region_origin : 453 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 2459.85
D.f: 452


Effect size
Cramer's V: 0.05


------------ Search or watch and region_destination ------------

Search or watch : 2 unique values.
region_destination : 498 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 4526.19
D.f: 497


Effect size
Cramer's V: 0.07


------------ Search or watch and country_origin ------------

Search or watch : 2 unique values.
country_origin : 188 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 1581.51
D.f: 187


Effect size
Cramer's V: 0.04


------------ Search or watch and continent_destination ------------

Search or watch : 2 unique values.
continent_destination : 6 unique values.
Search or watch        search   watch      All
continent_destination                         
AF                       9685    5107    14792
AS                      57538   24402    81940
EU                      96303   48622   144925
NorthA                 462078  257657   719735
OC                       8162    4011    12173
SA                      22684   11443    34127
All                    656450  351242  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 1366.95
D.f: 5


Effect size
Cramer's V: 0.04


------------ Search or watch and country_destination ------------

Search or watch : 2 unique values.
country_destination : 209 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 3266.86
D.f: 208


Effect size
Cramer's V: 0.06


------------ Search or watch and Domestic or international ------------

Search or watch : 2 unique values.
Domestic or international : 2 unique values.
Search or watch            search   watch      All
Domestic or international                         
Domestic                   328611  188141   516752
International              327839  163101   490940
All                        656450  351242  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 1125.43
D.f: 1


Effect size
Cramer's V: 0.03


------------ Search or watch and Region ------------

Search or watch : 2 unique values.
Region : 8 unique values.
Search or watch             search   watch      All
Region                                             
East Asia & Pacific          16016    7789    23805
Europe & Central Asia        38190   22080    60270
Latin America & Caribbean    57283   29759    87042
Middle East & North Africa    6150    2508     8658
North America               525521  281741   807262
South Asia                    2935    1267     4202
Sub-Saharan Africa            9825    5839    15664
nan                            530     259      789
All                         656450  351242  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 364.61
D.f: 7


Effect size
Cramer's V: 0.02


------------ Search or watch and IncomeGroup ------------

Search or watch : 2 unique values.
IncomeGroup : 5 unique values.
Search or watch      search   watch      All
IncomeGroup                                 
High income          592373  318276   910649
Low income              473     219      692
Lower middle income   11231    4736    15967
Upper middle income   51843   27752    79595
nan                     530     259      789
All                  656450  351242  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 197.91
D.f: 4


Effect size
Cramer's V: 0.01


------------ Search or watch and outcome ------------

Search or watch : 2 unique values.
outcome : 3 unique values.
Search or watch  search   watch      All
outcome                                 
expected              0  135969   135969
gained           656320    2810   659130
lost                130  212463   212593
All              656450  351242  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 994797.34
D.f: 2


Effect size
Cramer's V: 0.99


------------ Use frequency and continent_origin ------------

Use frequency : 2 unique values.
continent_origin : 6 unique values.
Use frequency     more than once   once      All
continent_origin                                
AF                         15353   1174    16527
AS                         27087   2736    29823
EU                         53770   5778    59548
NorthA                    800685  58618   859303
OC                          6538    837     7375
SA                         31413   3703    35116
All                       934846  72846  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 1687.31
D.f: 5


Effect size
Cramer's V: 0.04


------------ Use frequency and City origin ------------

Use frequency : 2 unique values.
City origin : 1237 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 7880.69
D.f: 1236


Effect size
Cramer's V: 0.09


------------ Use frequency and City destination ------------

Use frequency : 2 unique values.
City destination : 1487 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 5023.61
D.f: 1486


Effect size
Cramer's V: 0.07


------------ Use frequency and region_origin ------------

Use frequency : 2 unique values.
region_origin : 453 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 5458.12
D.f: 452


Effect size
Cramer's V: 0.07


------------ Use frequency and region_destination ------------

Use frequency : 2 unique values.
region_destination : 498 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 2487.29
D.f: 497


Effect size
Cramer's V: 0.05


------------ Use frequency and country_origin ------------

Use frequency : 2 unique values.
country_origin : 188 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 3570.3
D.f: 187


Effect size
Cramer's V: 0.06


------------ Use frequency and continent_destination ------------

Use frequency : 2 unique values.
continent_destination : 6 unique values.
Use frequency          more than once   once      All
continent_destination                                
AF                              13604   1188    14792
AS                              75356   6584    81940
EU                             134517  10408   144925
NorthA                         669171  50564   719735
OC                              11124   1049    12173
SA                              31074   3053    34127
All                            934846  72846  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 323.6
D.f: 5


Effect size
Cramer's V: 0.02


------------ Use frequency and country_destination ------------

Use frequency : 2 unique values.
country_destination : 209 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 1346.82
D.f: 208


Effect size
Cramer's V: 0.04


------------ Use frequency and Domestic or international ------------

Use frequency : 2 unique values.
Domestic or international : 2 unique values.
Use frequency              more than once   once      All
Domestic or international                                
Domestic                           480010  36742   516752
International                      454836  36104   490940
All                                934846  72846  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 22.29
D.f: 1


Effect size
Cramer's V: 0.0
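
A printed p-value of 0.0 and a Cramer's V of 0.0 are displayed values, not exact zeros. As a rough check, the chi-square survival function has closed forms for 1 and 2 degrees of freedom, so the p-values can be recovered with the standard library alone. This sketch assumes the printed X^2 is an ordinary chi-square statistic and that the notebook rounds its output; the function names are hypothetical.

```python
import math

def chi2_sf_dof1(x):
    """P(X > x) for a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))

def chi2_sf_dof2(x):
    """P(X > x) for a chi-square variable with 2 degrees of freedom."""
    return math.exp(-x / 2.0)

n = 1007692  # total number of rows, taken from the "All" margin

# Use frequency and Domestic or international: X^2 = 22.29, D.f = 1
p_dof1 = chi2_sf_dof1(22.29)
v_dof1 = math.sqrt(22.29 / (n * 1))

# first_rec and Use frequency: X^2 = 112.36, D.f = 2
p_dof2 = chi2_sf_dof2(112.36)

print(f"{p_dof1:.2e} {v_dof1:.4f} {p_dof2:.2e}")
```

Both p-values are tiny but nonzero, and with about a million rows almost any pair of variables rejects the null. That is why the effect size, not the p-value, is the figure to rank pairs by in this notebook.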


------------ Use frequency and Region ------------

Use frequency : 2 unique values.
Region : 8 unique values.
Use frequency               more than once   once      All
Region                                                    
East Asia & Pacific                  21599   2206    23805
Europe & Central Asia                54416   5854    60270
Latin America & Caribbean            79076   7966    87042
Middle East & North Africa            7797    861     8658
North America                       752910  54352   807262
South Asia                            3773    429     4202
Sub-Saharan Africa                   14557   1107    15664
nan                                    718     71      789
All                                 934846  72846  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 1633.26
D.f: 7


Effect size
Cramer's V: 0.04


------------ Use frequency and IncomeGroup ------------

Use frequency : 2 unique values.
IncomeGroup : 5 unique values.
Use frequency        more than once   once      All
IncomeGroup                                        
High income                  846629  64020   910649
Low income                      660     32      692
Lower middle income           14729   1238    15967
Upper middle income           72110   7485    79595
nan                             718     71      789
All                          934846  72846  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 632.31
D.f: 4


Effect size
Cramer's V: 0.03


------------ Use frequency and outcome ------------

Use frequency : 2 unique values.
outcome : 3 unique values.
Use frequency  more than once   once      All
outcome                                      
expected               119565  16404   135969
gained                 621259  37871   659130
lost                   194022  18571   212593
All                    934846  72846  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 7622.73
D.f: 2


Effect size
Cramer's V: 0.09
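
For tables larger than 2x2 the same numbers follow from the usual Pearson statistic with no continuity correction. The sketch below, in plain Python with hypothetical helper names, recomputes X^2, the degrees of freedom and Cramer's V from the observed counts in the table above (margins excluded).

```python
from math import sqrt

def chi_square(table):
    """Pearson chi-square, degrees of freedom and n for a list-of-rows table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, dof, n

def cramers_v(chi2, n, n_rows, n_cols):
    """Cramer's V: sqrt(chi2 / (n * min(rows - 1, cols - 1)))."""
    return sqrt(chi2 / (n * min(n_rows - 1, n_cols - 1)))

# Rows: expected, gained, lost; columns: more than once, once.
observed = [
    [119565, 16404],
    [621259, 37871],
    [194022, 18571],
]
chi2, dof, n = chi_square(observed)
v = cramers_v(chi2, n, len(observed), len(observed[0]))
print(round(chi2, 2), dof, round(v, 2))
```

With about a million rows even a small V such as this one gives a p-value that prints as 0.0, so the effect size is the more useful number to compare across pairs.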


------------ continent_origin and City origin ------------

continent_origin : 6 unique values.
City origin : 1237 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 4966131.02
D.f: 6180


Effect size
Cramer's V: 0.99


------------ continent_origin and City destination ------------

continent_origin : 6 unique values.
City destination : 1487 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 584427.65
D.f: 7430


Effect size
Cramer's V: 0.34


------------ continent_origin and region_origin ------------

continent_origin : 6 unique values.
region_origin : 453 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 4339419.2
D.f: 2260


Effect size
Cramer's V: 0.93


------------ continent_origin and region_destination ------------

continent_origin : 6 unique values.
region_destination : 498 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 434996.58
D.f: 2485


Effect size
Cramer's V: 0.29


------------ continent_origin and country_origin ------------

continent_origin : 6 unique values.
country_origin : 188 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 5036566.37
D.f: 935


Effect size
Cramer's V: 1.0


------------ continent_origin and continent_destination ------------

continent_origin : 6 unique values.
continent_destination : 6 unique values.
continent_origin          AF     AS     EU  NorthA    OC     SA      All
continent_destination                                                   
AF                      1011    774   2656   10089   100    162    14792
AS                       993  13534   7456   56735  2030   1192    81940
EU                      2111   4775  28792  100851  1093   7303   144925
NorthA                 12096   9364  17575  663988  1559  15153   719735
OC                       174   1139   1047    6893  2449    471    12173
SA                       142    237   2022   20747   144  10835    34127
All                    16527  29823  59548  859303  7375  35116  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 298463.39
D.f: 25


Effect size
Cramer's V: 0.24


------------ continent_origin and country_destination ------------

continent_origin : 6 unique values.
country_destination : 209 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 398941.06
D.f: 1040


Effect size
Cramer's V: 0.28


------------ continent_origin and Domestic or international ------------

continent_origin : 6 unique values.
Domestic or international : 2 unique values.
continent_origin              AF     AS     EU  NorthA    OC     SA      All
Domestic or international                                                   
Domestic                     473   2155   2947  504576  1579   5022   516752
International              16054  27668  56601  354727  5796  30094   490940
All                        16527  29823  59548  859303  7375  35116  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 130726.42
D.f: 5


Effect size
Cramer's V: 0.36


------------ continent_origin and Region ------------

continent_origin : 6 unique values.
Region : 8 unique values.
continent_origin               AF     AS     EU  NorthA    OC     SA      All
Region                                                                       
East Asia & Pacific             0  16433      0       0  7372      0    23805
Europe & Central Asia           0    830  59440       0     0      0    60270
Latin America & Caribbean       0      0      0   51928     0  35114    87042
Middle East & North Africa    816   7740    102       0     0      0     8658
North America                   0      0      0  807262     0      0   807262
South Asia                      0   4202      0       0     0      0     4202
Sub-Saharan Africa          15664      0      0       0     0      0    15664
nan                            47    618      6     113     3      2      789
All                         16527  29823  59548  859303  7375  35116  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 3421545.28
D.f: 35


Effect size
Cramer's V: 0.82


------------ continent_origin and IncomeGroup ------------

continent_origin : 6 unique values.
IncomeGroup : 5 unique values.
continent_origin        AF     AS     EU  NorthA    OC     SA      All
IncomeGroup                                                           
High income              5  13549  57734  828433  7316   3612   910649
Low income             292    122      0     278     0      0      692
Lower middle income   1484   9480    311    4116    14    562    15967
Upper middle income  14699   6054   1497   26363    42  30940    79595
nan                     47    618      6     113     3      2      789
All                  16527  29823  59548  859303  7375  35116  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 715623.92
D.f: 20


Effect size
Cramer's V: 0.42


------------ continent_origin and outcome ------------

continent_origin : 6 unique values.
outcome : 3 unique values.
continent_origin     AF     AS     EU  NorthA    OC     SA      All
outcome                                                            
expected           2329   3504   8691  115556  1085   4804   135969
gained            10493  20713  37869  562185  4752  23118   659130
lost               3705   5606  12988  181562  1538   7194   212593
All               16527  29823  59548  859303  7375  35116  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 361.08
D.f: 10


Effect size
Cramer's V: 0.01


------------ City origin and City destination ------------

City origin : 1237 unique values.
City destination : 1487 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 14552724.99
D.f: 1836696


Effect size
Cramer's V: 0.11


------------ City origin and region_origin ------------

City origin : 1237 unique values.
region_origin : 453 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 437927069.37
D.f: 558672


Effect size
Cramer's V: 0.98


------------ City origin and region_destination ------------

City origin : 1237 unique values.
region_destination : 498 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 5541398.34
D.f: 614292


Effect size
Cramer's V: 0.11


------------ City origin and country_origin ------------

City origin : 1237 unique values.
country_origin : 188 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 185354945.85
D.f: 231132


Effect size
Cramer's V: 0.99


------------ City origin and continent_destination ------------

City origin : 1237 unique values.
continent_destination : 6 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 452932.9
D.f: 6180


Effect size
Cramer's V: 0.3


------------ City origin and country_destination ------------

City origin : 1237 unique values.
country_destination : 209 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 2447098.69
D.f: 257088


Effect size
Cramer's V: 0.11


------------ City origin and Domestic or international ------------

City origin : 1237 unique values.
Domestic or international : 2 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 295496.29
D.f: 1236


Effect size
Cramer's V: 0.54


------------ City origin and Region ------------

City origin : 1237 unique values.
Region : 8 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 6976614.73
D.f: 8652


Effect size
Cramer's V: 0.99


------------ City origin and IncomeGroup ------------

City origin : 1237 unique values.
IncomeGroup : 5 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 3978232.18
D.f: 4944


Effect size
Cramer's V: 0.99


------------ City origin and outcome ------------

City origin : 1237 unique values.
outcome : 3 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 7142.62
D.f: 2472


Effect size
Cramer's V: 0.06


------------ City destination and region_origin ------------

City destination : 1487 unique values.
region_origin : 453 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 6673185.65
D.f: 671672


Effect size
Cramer's V: 0.12


------------ City destination and region_destination ------------

City destination : 1487 unique values.
region_destination : 498 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 482080166.23
D.f: 738542


Effect size
Cramer's V: 0.98


------------ City destination and country_origin ------------

City destination : 1487 unique values.
country_origin : 188 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 3213619.6
D.f: 277882


Effect size
Cramer's V: 0.13


------------ City destination and continent_destination ------------

City destination : 1487 unique values.
continent_destination : 6 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 4960803.6
D.f: 7430


Effect size
Cramer's V: 0.99


------------ City destination and country_destination ------------

City destination : 1487 unique values.
country_destination : 209 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 206909300.84
D.f: 309088


Effect size
Cramer's V: 0.99


------------ City destination and Domestic or international ------------

City destination : 1487 unique values.
Domestic or international : 2 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 628387.01
D.f: 1486


Effect size
Cramer's V: 0.79


------------ City destination and Region ------------

City destination : 1487 unique values.
Region : 8 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 604117.78
D.f: 10402


Effect size
Cramer's V: 0.29


------------ City destination and IncomeGroup ------------

City destination : 1487 unique values.
IncomeGroup : 5 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 166858.98
D.f: 5944


Effect size
Cramer's V: 0.2


------------ City destination and outcome ------------

City destination : 1487 unique values.
outcome : 3 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 12739.85
D.f: 2972


Effect size
Cramer's V: 0.08


------------ region_origin and region_destination ------------

region_origin : 453 unique values.
region_destination : 498 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 3825269.06
D.f: 224644


Effect size
Cramer's V: 0.09


------------ region_origin and country_origin ------------

region_origin : 453 unique values.
country_origin : 188 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 99553016.33
D.f: 84524


Effect size
Cramer's V: 0.73


------------ region_origin and continent_destination ------------

region_origin : 453 unique values.
continent_destination : 6 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 365684.32
D.f: 2260


Effect size
Cramer's V: 0.27


------------ region_origin and country_destination ------------

region_origin : 453 unique values.
country_destination : 209 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 1525303.61
D.f: 94016


Effect size
Cramer's V: 0.09


------------ region_origin and Domestic or international ------------

region_origin : 453 unique values.
Domestic or international : 2 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 266668.75
D.f: 452


Effect size
Cramer's V: 0.51


------------ region_origin and Region ------------

region_origin : 453 unique values.
Region : 8 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 5679553.71
D.f: 3164


Effect size
Cramer's V: 0.9


------------ region_origin and IncomeGroup ------------

region_origin : 453 unique values.
IncomeGroup : 5 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 2832378.21
D.f: 1808


Effect size
Cramer's V: 0.84


------------ region_origin and outcome ------------

region_origin : 453 unique values.
outcome : 3 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 3925.7
D.f: 904


Effect size
Cramer's V: 0.04


------------ region_destination and country_origin ------------

region_destination : 498 unique values.
country_origin : 188 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 1790597.54
D.f: 92939


Effect size
Cramer's V: 0.1


------------ region_destination and continent_destination ------------

region_destination : 498 unique values.
continent_destination : 6 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 4209379.45
D.f: 2485


Effect size
Cramer's V: 0.91


------------ region_destination and country_destination ------------

region_destination : 498 unique values.
country_destination : 209 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 107420517.61
D.f: 103376


Effect size
Cramer's V: 0.72


------------ region_destination and Domestic or international ------------

region_destination : 498 unique values.
Domestic or international : 2 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 601591.11
D.f: 497


Effect size
Cramer's V: 0.77


------------ region_destination and Region ------------

region_destination : 498 unique values.
Region : 8 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 434033.35
D.f: 3479


Effect size
Cramer's V: 0.25


------------ region_destination and IncomeGroup ------------

region_destination : 498 unique values.
IncomeGroup : 5 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 115182.45
D.f: 1988


Effect size
Cramer's V: 0.17


------------ region_destination and outcome ------------

region_destination : 498 unique values.
outcome : 3 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 7970.65
D.f: 994


Effect size
Cramer's V: 0.06


------------ country_origin and continent_destination ------------

country_origin : 188 unique values.
continent_destination : 6 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 368861.0
D.f: 935


Effect size
Cramer's V: 0.27


------------ country_origin and country_destination ------------

country_origin : 188 unique values.
country_destination : 209 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 1364649.57
D.f: 38896


Effect size
Cramer's V: 0.09


------------ country_origin and Domestic or international ------------

country_origin : 188 unique values.
Domestic or international : 2 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 254119.97
D.f: 187


Effect size
Cramer's V: 0.5


------------ country_origin and Region ------------

country_origin : 188 unique values.
Region : 8 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 7053844.0
D.f: 1309


Effect size
Cramer's V: 1.0


------------ country_origin and IncomeGroup ------------

country_origin : 188 unique values.
IncomeGroup : 5 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 4030768.0
D.f: 748


Effect size
Cramer's V: 1.0


------------ country_origin and outcome ------------

country_origin : 188 unique values.
outcome : 3 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 2014.3
D.f: 374


Effect size
Cramer's V: 0.03


------------ continent_destination and country_destination ------------

continent_destination : 6 unique values.
country_destination : 209 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 5033241.42
D.f: 1040


Effect size
Cramer's V: 1.0


------------ continent_destination and Domestic or international ------------

continent_destination : 6 unique values.
Domestic or international : 2 unique values.
continent_destination         AF     AS      EU  NorthA     OC     SA      All
Domestic or international                                                     
Domestic                     476   2156    2943  504576   1579   5022   516752
International              14316  79784  141982  215159  10594  29105   490940
All                        14792  81940  144925  719735  12173  34127  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 359509.51
D.f: 5


Effect size
Cramer's V: 0.6


------------ continent_destination and Region ------------

continent_destination : 6 unique values.
Region : 8 unique values.
continent_destination          AF     AS      EU  NorthA     OC     SA      All
Region
East Asia & Pacific           239  10383    2734    6830   3350    269    23805
Europe & Central Asia        2687   7644   29077   17759   1060   2043    60270
Latin America & Caribbean     631   2439   15098   53955    655  14264    87042
Middle East & North Africa    658   3405    2723    1693     88     91     8658
North America                9620  55486   93049  625081   6709  17317   807262
South Asia                     21   1391     422    2272     93      3     4202
Sub-Saharan Africa            918    819    1757   11864    172    134    15664
nan                            18    373      65     281     46      6      789
All                         14792  81940  144925  719735  12173  34127  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 234076.83
D.f: 35


Effect size
Cramer's V: 0.22


------------ continent_destination and IncomeGroup ------------

continent_destination : 6 unique values.
IncomeGroup : 5 unique values.
continent_destination     AF     AS      EU  NorthA     OC     SA      All
IncomeGroup                                                               
High income            13254  71872  127569  665139  10839  21976   910649
Low income               103     85      69     415      1     19      692
Lower middle income      266   4131    2060    8520    444    546    15967
Upper middle income     1151   5479   15162   45380    843  11580    79595
nan                       18    373      65     281     46      6      789
All                    14792  81940  144925  719735  12173  34127  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 45971.34
D.f: 20


Effect size
Cramer's V: 0.11


------------ continent_destination and outcome ------------

continent_destination : 6 unique values.
outcome : 3 unique values.
continent_destination     AF     AS      EU  NorthA     OC     SA      All
outcome                                                                   
expected                2103  10557   21643   95337   1792   4537   135969
gained                  9722  57634   96639  464188   8203  22744   659130
lost                    2967  13749   26643  160210   2178   6846   212593
All                    14792  81940  144925  719735  12173  34127  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 2521.05
D.f: 10


Effect size
Cramer's V: 0.04


------------ country_destination and Domestic or international ------------

country_destination : 209 unique values.
Domestic or international : 2 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 606139.21
D.f: 208


Effect size
Cramer's V: 0.78


------------ country_destination and Region ------------

country_destination : 209 unique values.
Region : 8 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 390561.05
D.f: 1456


Effect size
Cramer's V: 0.24


------------ country_destination and IncomeGroup ------------

country_destination : 209 unique values.
IncomeGroup : 5 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 88945.59
D.f: 832


Effect size
Cramer's V: 0.15


------------ country_destination and outcome ------------

country_destination : 209 unique values.
outcome : 3 unique values.

Too many values to print contingency table.


X^2 TEST

p-value to reject null: 0.0
X^2: 5351.55
D.f: 416


Effect size
Cramer's V: 0.05


------------ Domestic or international and Region ------------

Domestic or international : 2 unique values.
Region : 8 unique values.
Domestic or international   Domestic  International      All
Region                                                      
East Asia & Pacific             3051          20754    23805
Europe & Central Asia           2993          57277    60270
Latin America & Caribbean       8565          78477    87042
Middle East & North Africa       114           8544     8658
North America                 501033         306229   807262
South Asia                       547           3655     4202
Sub-Saharan Africa               449          15215    15664
nan                                0            789      789
All                           516752         490940  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 189898.64
D.f: 7


Effect size
Cramer's V: 0.43


------------ Domestic or international and IncomeGroup ------------

Domestic or international : 2 unique values.
IncomeGroup : 5 unique values.
Domestic or international  Domestic  International      All
IncomeGroup                                                
High income                  506093         404556   910649
Low income                        9            683      692
Lower middle income            1240          14727    15967
Upper middle income            9410          70185    79595
nan                               0            789      789
All                          516752         490940  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 69948.65
D.f: 4


Effect size
Cramer's V: 0.26


------------ Domestic or international and outcome ------------

Domestic or international : 2 unique values.
outcome : 3 unique values.
Domestic or international  Domestic  International      All
outcome                                                    
expected                      67770          68199   135969
gained                       330349         328781   659130
lost                         118633          93960   212593
All                          516752         490940  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 2208.84
D.f: 2


Effect size
Cramer's V: 0.05


------------ Region and IncomeGroup ------------

Region : 8 unique values.
IncomeGroup : 5 unique values.
Region               East Asia & Pacific  Europe & Central Asia  \
IncomeGroup                                                       
High income                        14011                  57720   
Low income                             0                      0   
Lower middle income                 5531                    323   
Upper middle income                 4263                   2227   
nan                                    0                      0   
All                                23805                  60270   

Region               Latin America & Caribbean  Middle East & North Africa  \
IncomeGroup                                                                  
High income                              24783                        6868   
Low income                                 278                           0   
Lower middle income                       4678                         802   
Upper middle income                      57303                         988   
nan                                          0                           0   
All                                      87042                        8658   

Region               North America  South Asia  Sub-Saharan Africa  nan  \
IncomeGroup                                                               
High income                 807262           0                   5    0   
Low income                       0         122                 292    0   
Lower middle income              0        3938                 695    0   
Upper middle income              0         142               14672    0   
nan                              0           0                   0  789   
All                         807262        4202               15664  789   

Region                   All  
IncomeGroup                   
High income           910649  
Low income               692  
Lower middle income    15967  
Upper middle income    79595  
nan                      789  
All                  1007692  


X^2 TEST

p-value to reject null: 0.0
X^2: 1992213.8
D.f: 28


Effect size
Cramer's V: 0.7


------------ Region and outcome ------------

Region : 8 unique values.
outcome : 3 unique values.
Region    East Asia & Pacific  Europe & Central Asia  \
outcome                                                
expected                 3115                   8803   
gained                  16086                  38324   
lost                     4604                  13143   
All                     23805                  60270   

Region    Latin America & Caribbean  Middle East & North Africa  \
outcome                                                           
expected                      11578                         907   
gained                        57430                        6162   
lost                          18034                        1589   
All                           87042                        8658   

Region    North America  South Asia  Sub-Saharan Africa  nan      All  
outcome                                                                
expected         108763         473                2232   98   135969  
gained           527797        2940                9861  530   659130  
lost             170702         789                3571  161   212593  
All              807262        4202               15664  789  1007692  


X^2 TEST

p-value to reject null: 0.0
X^2: 385.63
D.f: 14


Effect size
Cramer's V: 0.01


------------ IncomeGroup and outcome ------------

IncomeGroup : 5 unique values.
outcome : 3 unique values.
IncomeGroup  High income  Low income  Lower middle income  Upper middle income  nan      All
outcome
expected          123050          73                 1840                10908   98   135969
gained            594888         477                11257                51978  530   659130
lost              192711         142                 2870                16709  161   212593
All               910649         692                15967                79595  789  1007692


X^2 TEST

p-value to reject null: 0.0
X^2: 196.3
D.f: 8


Effect size
Cramer's V: 0.01
In [15]:
categorical_correlations = categorical_correlations.fillna(0)
categorical_correlations.style.background_gradient(cmap='Blues', axis=None)
Out[15]:
origin_city destination_city trip_type weekend filter_no_lcc filter_non_stop filter_short_layover filter_name first_rec last_rec is_session_1 Search or watch Use frequency continent_origin City origin City destination region_origin region_destination country_origin continent_destination country_destination Domestic or international Region IncomeGroup outcome
origin_city 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
destination_city 0.113569 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
trip_type 0.250939 0.189375 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
weekend 0.188665 0.311903 0.223129 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
filter_no_lcc 0.0551823 0.0622049 0.00345676 0.0100393 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
filter_non_stop 0.140617 0.139931 0 0.0520667 0.139641 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
filter_short_layover 0.0889076 0.110387 0.00323824 0.0278636 0.0917811 0.0601358 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
filter_name 0.0809152 0.0862918 0.00666794 0.0580056 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
first_rec 0.110209 0.13466 0.0145079 0.0340823 0.0429472 0.0509839 0.0200045 0.0465326 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
last_rec 0.0976916 0.119561 0.00994136 0.00924758 0.0380062 0.0264573 0.0162132 0.0329097 0.907649 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
is_session_1 0.112731 0.0851717 0.0570345 0.00420952 0.00346529 0.00843189 0.0120254 0.0169052 0.0522848 0.0525428 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Search or watch 0.0702433 0.0871007 0.0141817 0.0473606 0.0495041 0.147157 0.0834718 0.175005 0.181198 0.111532 0.142807 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Use frequency 0.0905946 0.0727718 0.0420551 0.00622556 0.0104085 0.0147801 0.00732145 0.0186162 0.0105593 0.0191198 0.298281 0.0784814 0 0 0 0 0 0 0 0 0 0 0 0 0
continent_origin 1 0.34552 0.122024 0.115361 0.0159211 0.0386458 0.0108232 0.0189411 0.0553329 0.0418822 0.0489804 0.0184665 0.0409197 0 0 0 0 0 0 0 0 0 0 0 0
City origin 1 0.109921 0.248224 0.183669 0.0529629 0.137825 0.0863381 0.0787917 0.108051 0.0955754 0.110113 0.0649639 0.0884338 0.992796 0 0 0 0 0 0 0 0 0 0 0
City destination 0.111665 1 0.185186 0.309684 0.0609374 0.137114 0.108542 0.0844851 0.132791 0.117849 0.0813158 0.084164 0.0706064 0.340578 0.108093 0 0 0 0 0 0 0 0 0 0
region_origin 1 0.123453 0.223306 0.170519 0.0433939 0.114533 0.0582544 0.0617396 0.0888864 0.0762055 0.0946975 0.0494072 0.0735966 0.92804 0.980546 0.121041 0 0 0 0 0 0 0 0 0
region_destination 0.110787 1 0.153781 0.294371 0.0493888 0.115038 0.0829751 0.0675796 0.106929 0.090247 0.062591 0.0670197 0.049682 0.293829 0.105188 0.98111 0.0916427 0 0 0 0 0 0 0 0
country_origin 1 0.132785 0.181202 0.158949 0.0345896 0.0634845 0.0351297 0.0363993 0.073781 0.0610003 0.075962 0.0396161 0.0595235 0.999812 0.991785 0.130591 0.726846 0.0974797 0 0 0 0 0 0 0
continent_destination 0.305885 1 0.036969 0.228586 0.0238164 0.07256 0.0290019 0.0352661 0.0535833 0.032058 0.0206251 0.0368309 0.0179201 0.243387 0.299825 0.992264 0.269404 0.914029 0.270572 0 0 0 0 0 0
country_destination 0.111374 1 0.116385 0.257137 0.0395197 0.0949796 0.0648361 0.0543431 0.0814436 0.0655156 0.0465135 0.0569379 0.0365587 0.281388 0.108051 0.993561 0.0853066 0.715893 0.0850993 0.999482 0 0 0 0 0
Domestic or international 0.554191 0.796364 0.010421 0.271384 0.0336368 0.0689211 0.0468113 0.0861626 0.0492292 0.0173071 0.0133521 0.0334192 0.00470323 0.360178 0.541517 0.789677 0.514425 0.772657 0.502175 0.597298 0.775572 0 0 0 0
Region 1 0.296897 0.131034 0.132063 0.0195443 0.0451362 0.015723 0.022655 0.0610385 0.0472528 0.0434758 0.0190218 0.040259 0.824066 0.994511 0.29265 0.897313 0.248055 1 0.215541 0.235305 0.434107 0 0 0
IncomeGroup 1 0.208374 0.0601226 0.088017 0.0145253 0.0315288 0.00819696 0.0173041 0.0255945 0.0208998 0.0228567 0.0140144 0.0250496 0.421355 0.993462 0.203461 0.838266 0.169044 1 0.106795 0.148549 0.263467 0.70303 0 0
outcome 0.0631787 0.0815776 0.0310033 0.0494215 0.0538591 0.145482 0.0828546 0.123384 0.138129 0.103548 0.146406 0.993581 0.0869744 0.0133852 0.0595319 0.0795066 0.0441346 0.062888 0.0316143 0.0353681 0.0515301 0.0468186 0.0138327 0.00986921 0

Numerical variables

In [15]:
df.head(4)
Out[15]:
transaction_id origin_city destination_city user_id trip_id trip_type departure_date return_date stay weekend filter_no_lcc filter_non_stop filter_short_layover filter_name status_updates first_search_dt watch_added_dt latest_status_change_dt status_latest total_notifs total_buy_notifs first_rec first_total last_rec last_total first_buy_dt first_buy_total lowest_total Use frequency outcome ordered session diff_day diff Search or watch first_buy - lowest_total days_to_departure latitude_deg_origin longitude_deg_origin continent_origin City origin latitude_deg_destination longitude_deg_destination continent_destination City destination region_origin region_destination country_origin country_destination Domestic or international Adult population Country name Capital origin name Count_uniq_users_per_country Percent unique users to country adult pop Country Code Region IncomeGroup Passengers carried Q1 Percent unique users to passengers carried by country is_session_1 is_US is_CA
0 17 MEX YUL e42e7c15cde08c19905ee12200fad7cb5af36d1fe3a331... 05d59806e67fa9a5b2747bc1b24842189bba0c45e49d37... round_trip 2018-04-06 00:00:00 2018-04-27 00:00:00 21.0 0 0 1 0 NonStop 5 2018-03-15 21:29:00 2018-03-16 18:02:00 2018-04-07 05:02:00 expired 6.0 5.0 buy 455.0 buy 566.0 2018-03-16 18:00:00 455.0 455.0 1 lost 1 1 0 0 days 00:00:00.000000000 watch 0.0 21 19.436300 -99.072098 NorthA Mexico City 45.470600 -73.740799 NorthA Montréal DIF QC MX CA International 81522558.0 Mexico Mexico City 10662 0.013079 MEX Latin America & Caribbean Upper middle income 16142410.0 0.066050 True False False
1 27 ALB FLL 4eb7c43c5afdaf75d8ac5f3f92b5e02e4cd4ab716e7e4a... 0a44613d689f7fafc7c5ab6fae7cb24ff2fd8b0fb162f6... round_trip 2018-03-09 00:00:00 2018-03-16 00:00:00 7.0 0 0 0 1 ShortLayover 1 2018-02-24 14:36:00 NaN 2018-02-24 14:36:00 shopped 0.0 0.0 buy 438.0 buy 438.0 2018-02-24 14:36:00 438.0 438.0 1 gained 1 1 0 0 days 00:00:00.000000000 search 0.0 12 42.748299 -73.801697 NorthA Albany 26.072599 -80.152702 NorthA Fort Lauderdale NY FL US US Domestic 244635911.0 United States Washington 585228 0.239224 USA North America High income 222255500.0 0.263313 True True False
2 74 GEG BZE 74ad865b8d1fc01eb27c98ac66db0eb27d8a25c4c04b3a... a5ad5dc889433c1642318fc8b5233c87bf10f5ed19f906... round_trip 2018-05-22 00:00:00 2018-06-05 00:00:00 14.0 0 0 0 0 NoFilter 4 2018-03-19 23:14:00 2018-03-19 23:14:00 2018-03-19 23:14:00 inactive 0.0 0.0 buy 615.0 buy 615.0 2018-03-19 23:14:00 615.0 615.0 1 lost 1 1 0 0 days 00:00:00.000000000 watch 0.0 63 47.619900 -117.533997 NorthA Spokane 17.539101 -88.308197 NorthA Belize City WA BZ US BZ International 244635911.0 United States Washington 585228 0.239224 USA North America High income 222255500.0 0.263313 True True False
3 82 YUL FLL a0b7a4d6c6ddbbc14980b98c2ee74d29e6ede0141fdf4b... 72b0cf11dd4baeb987a3a7d95916b5eb87e699cfe82e85... round_trip 2018-04-14 00:00:00 2018-04-20 00:00:00 6.0 0 0 0 0 NoFilter 1 2018-04-07 19:36:00 NaN 2018-04-07 19:36:00 shopped 0.0 0.0 buy 358.0 buy 358.0 2018-04-07 19:36:00 358.0 358.0 1 gained 1 1 0 0 days 00:00:00.000000000 search 0.0 6 45.470600 -73.740799 NorthA Montréal 26.072599 -80.152702 NorthA Fort Lauderdale QC FL CA US International 29156938.0 Canada Ottawa 50409 0.172889 CAN North America High income 22345000.0 0.225594 True False True
In [ ]:
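# Whole days from first search to departure; this matches the existing days_to_departure column in the rows shown above.
df['Days_to_departure'] = (pd.to_datetime(df['departure_date']) - pd.to_datetime(df['first_search_dt'])).dt.days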
In [16]:
numerical = ["stay","status_updates","total_notifs","total_buy_notifs", "first_total",
             "first_buy_total","lowest_total","session","first_buy - lowest_total",
             "Adult population","Count_uniq_users_per_country","Passengers carried Q1"]
In [17]:
numerical_correlations = pd.DataFrame(index=numerical, columns=numerical)

Summarizing numerical variables against each other
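
The printed output in this section reports a Pearson's R and a p-value for each pair. As a minimal sketch of that kind of check, the helper below assumes a thin wrapper around `scipy.stats.pearsonr` that drops rows where either value is missing. `pearson_summary` and the toy data are hypothetical and are not the notebook's `summarize_numerical`, whose definition lives elsewhere.

```python
import numpy as np
import pandas as pd
from scipy import stats


def pearson_summary(df, x, y):
    """Return (p_value, r) for two numeric columns, using rows where both are present."""
    pair = df[[x, y]].dropna()
    r, p = stats.pearsonr(pair[x], pair[y])
    return p, r


# Toy data, not the notebook's dataset: two closely related price columns.
rng = np.random.default_rng(0)
price = rng.normal(500, 100, size=200)
toy = pd.DataFrame({
    "first_total": price,
    "lowest_total": price - rng.uniform(0, 20, size=200),
})
toy.loc[0, "lowest_total"] = np.nan  # one missing value, dropped pairwise

p, r = pearson_summary(toy, "first_total", "lowest_total")
print(f"Pearson's R: {r:.2f}")
print(f"p-value to reject null: {p:.3g}")
```

Returning the p-value first mirrors the `p, effect = ...` unpacking used in the loop in this section.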

In [18]:
for i, j in combinations(numerical, 2):
    p, effect = summarize_numerical(df, i, j)
    if p < .05:
        # Column i, row j: fills the lower triangle. .loc avoids chained assignment.
        numerical_correlations.loc[j, i] = effect

------------ stay and status_updates ------------

 Pearson's R: -0.0

 p-value to reject null: 0.13


------------ stay and total_notifs ------------

 Pearson's R: -0.02

 p-value to reject null: 0.0


------------ stay and total_buy_notifs ------------

 Pearson's R: -0.01

 p-value to reject null: 0.0


------------ stay and first_total ------------

 Pearson's R: 0.3

 p-value to reject null: 0.0


------------ stay and first_buy_total ------------

 Pearson's R: 0.3

 p-value to reject null: 0.0


------------ stay and lowest_total ------------

 Pearson's R: 0.3

 p-value to reject null: 0.0


------------ stay and session ------------

 Pearson's R: -0.0

 p-value to reject null: 0.23


------------ stay and first_buy - lowest_total ------------

 Pearson's R: 0.03

 p-value to reject null: 0.0


------------ stay and Adult population ------------

 Pearson's R: -0.09

 p-value to reject null: 0.0


------------ stay and Count_uniq_users_per_country ------------

 Pearson's R: -0.19

 p-value to reject null: 0.0


------------ stay and Passengers carried Q1 ------------

 Pearson's R: -0.19

 p-value to reject null: 0.0


------------ status_updates and total_notifs ------------

 Pearson's R: 0.37

 p-value to reject null: 0.0


------------ status_updates and total_buy_notifs ------------

 Pearson's R: 0.33

 p-value to reject null: 0.0


------------ status_updates and first_total ------------

 Pearson's R: -0.03

 p-value to reject null: 0.0


------------ status_updates and first_buy_total ------------

 Pearson's R: -0.03

 p-value to reject null: 0.0


------------ status_updates and lowest_total ------------

 Pearson's R: -0.06

 p-value to reject null: 0.0


------------ status_updates and session ------------

 Pearson's R: -0.12

 p-value to reject null: 0.0


------------ status_updates and first_buy - lowest_total ------------

 Pearson's R: 0.14

 p-value to reject null: 0.0


------------ status_updates and Adult population ------------

 Pearson's R: -0.01

 p-value to reject null: 0.0


------------ status_updates and Count_uniq_users_per_country ------------

 Pearson's R: -0.0

 p-value to reject null: 0.0


------------ status_updates and Passengers carried Q1 ------------

 Pearson's R: -0.0

 p-value to reject null: 0.03


------------ total_notifs and total_buy_notifs ------------

 Pearson's R: 0.89

 p-value to reject null: 0.0


------------ total_notifs and first_total ------------

 Pearson's R: -0.01

 p-value to reject null: 0.0


------------ total_notifs and first_buy_total ------------

 Pearson's R: -0.01

 p-value to reject null: 0.0


------------ total_notifs and lowest_total ------------

 Pearson's R: -0.06

 p-value to reject null: 0.0


------------ total_notifs and session ------------

 Pearson's R: -0.12

 p-value to reject null: 0.0


------------ total_notifs and first_buy - lowest_total ------------

 Pearson's R: 0.24

 p-value to reject null: 0.0


------------ total_notifs and Adult population ------------

 Pearson's R: 0.01

 p-value to reject null: 0.0


------------ total_notifs and Count_uniq_users_per_country ------------

 Pearson's R: 0.01

 p-value to reject null: 0.0


------------ total_notifs and Passengers carried Q1 ------------

 Pearson's R: 0.0

 p-value to reject null: 0.0


------------ total_buy_notifs and first_total ------------

 Pearson's R: -0.01

 p-value to reject null: 0.0


------------ total_buy_notifs and first_buy_total ------------

 Pearson's R: -0.0

 p-value to reject null: 0.0


------------ total_buy_notifs and lowest_total ------------

 Pearson's R: -0.05

 p-value to reject null: 0.0


------------ total_buy_notifs and session ------------

 Pearson's R: -0.1

 p-value to reject null: 0.0


------------ total_buy_notifs and first_buy - lowest_total ------------

 Pearson's R: 0.24

 p-value to reject null: 0.0


------------ total_buy_notifs and Adult population ------------

 Pearson's R: 0.0

 p-value to reject null: 0.58


------------ total_buy_notifs and Count_uniq_users_per_country ------------

 Pearson's R: -0.0

 p-value to reject null: 0.0


------------ total_buy_notifs and Passengers carried Q1 ------------

 Pearson's R: -0.0

 p-value to reject null: 0.0


------------ first_total and first_buy_total ------------

 Pearson's R: 0.99

 p-value to reject null: 0.0


------------ first_total and lowest_total ------------

 Pearson's R: 0.99

 p-value to reject null: 0.0


------------ first_total and session ------------

 Pearson's R: -0.02

 p-value to reject null: 0.0


------------ first_total and first_buy - lowest_total ------------

 Pearson's R: 0.21

 p-value to reject null: 0.0


------------ first_total and Adult population ------------

 Pearson's R: -0.12

 p-value to reject null: 0.0


------------ first_total and Count_uniq_users_per_country ------------

 Pearson's R: -0.17

 p-value to reject null: 0.0


------------ first_total and Passengers carried Q1 ------------

 Pearson's R: -0.18

 p-value to reject null: 0.0


------------ first_buy_total and lowest_total ------------

 Pearson's R: 0.99

 p-value to reject null: 0.0


------------ first_buy_total and session ------------

 Pearson's R: -0.02

 p-value to reject null: 0.0


------------ first_buy_total and first_buy - lowest_total ------------

 Pearson's R: 0.23

 p-value to reject null: 0.0


------------ first_buy_total and Adult population ------------

 Pearson's R: -0.1

 p-value to reject null: 0.0


------------ first_buy_total and Count_uniq_users_per_country ------------

 Pearson's R: -0.15

 p-value to reject null: 0.0


------------ first_buy_total and Passengers carried Q1 ------------

 Pearson's R: -0.17

 p-value to reject null: 0.0


------------ lowest_total and session ------------

 Pearson's R: -0.01

 p-value to reject null: 0.0


------------ lowest_total and first_buy - lowest_total ------------

 Pearson's R: 0.06

 p-value to reject null: 0.0


------------ lowest_total and Adult population ------------

 Pearson's R: -0.12

 p-value to reject null: 0.0


------------ lowest_total and Count_uniq_users_per_country ------------

 Pearson's R: -0.17

 p-value to reject null: 0.0


------------ lowest_total and Passengers carried Q1 ------------

 Pearson's R: -0.18

 p-value to reject null: 0.0


------------ session and first_buy - lowest_total ------------

 Pearson's R: -0.04

 p-value to reject null: 0.0


------------ session and Adult population ------------

 Pearson's R: 0.02

 p-value to reject null: 0.0


------------ session and Count_uniq_users_per_country ------------

 Pearson's R: 0.04

 p-value to reject null: 0.0


------------ session and Passengers carried Q1 ------------

 Pearson's R: 0.04

 p-value to reject null: 0.0


------------ first_buy - lowest_total and Adult population ------------

 Pearson's R: -0.0

 p-value to reject null: 0.59


------------ first_buy - lowest_total and Count_uniq_users_per_country ------------

 Pearson's R: -0.0

 p-value to reject null: 0.02


------------ first_buy - lowest_total and Passengers carried Q1 ------------

 Pearson's R: -0.01

 p-value to reject null: 0.0


------------ Adult population and Count_uniq_users_per_country ------------

 Pearson's R: 0.78

 p-value to reject null: 0.0


------------ Adult population and Passengers carried Q1 ------------

 Pearson's R: 0.81

 p-value to reject null: 0.0


------------ Count_uniq_users_per_country and Passengers carried Q1 ------------

 Pearson's R: 1.0

 p-value to reject null: 0.0
In [19]:
# The frame was created without a dtype, so cast to float before styling.
numerical_correlations = numerical_correlations.fillna(0).astype(float)
numerical_correlations.style.background_gradient(cmap='Blues', axis=None)
Out[19]:
 | stay | status_updates | total_notifs | total_buy_notifs | first_total | first_buy_total | lowest_total | session | first_buy - lowest_total | Adult population | Count_uniq_users_per_country | Passengers carried Q1
stay | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
status_updates | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
total_notifs | -0.0154445 | 0.372897 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
total_buy_notifs | -0.0101269 | 0.333896 | 0.887629 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
first_total | 0.297157 | -0.0291507 | -0.00730957 | -0.00953292 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
first_buy_total | 0.299473 | -0.0262065 | -0.00955931 | -0.00423984 | 0.990864 | 0 | 0 | 0 | 0 | 0 | 0 | 0
lowest_total | 0.298025 | -0.0571499 | -0.057589 | -0.0513747 | 0.987753 | 0.986177 | 0 | 0 | 0 | 0 | 0 | 0
session | 0 | -0.115988 | -0.120906 | -0.100057 | -0.0193853 | -0.0181638 | -0.0129198 | 0 | 0 | 0 | 0 | 0
first_buy - lowest_total | 0.0261217 | 0.141343 | 0.240642 | 0.244105 | 0.205797 | 0.226673 | 0.0621552 | -0.038378 | 0 | 0 | 0 | 0
Adult population | -0.0853436 | -0.00573711 | 0.00508623 | 0 | -0.117438 | -0.095133 | -0.119279 | 0.0238836 | 0 | 0 | 0 | 0
Count_uniq_users_per_country | -0.190553 | -0.00434965 | 0.00583649 | -0.00449199 | -0.165684 | -0.149245 | -0.168337 | 0.0393756 | -0.00307738 | 0.784671 | 0 | 0
Passengers carried Q1 | -0.193221 | -0.00220959 | 0.00427519 | -0.00460289 | -0.18065 | -0.167473 | -0.183184 | 0.0405854 | -0.00579244 | 0.811608 | 0.995722 | 0
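
With roughly a million rows, almost every pair above clears p < .05, including correlations with magnitude below 0.01, so the p-value filter alone does little to narrow the list. A minimal sketch of ranking pairs by the size of R instead, assuming a lower-triangular matrix shaped like `numerical_correlations`. The helper name `strong_pairs` and the 0.1 cutoff are illustrative choices, not part of the notebook.

```python
import pandas as pd


def strong_pairs(corr, threshold=0.1):
    """List (row, column, effect) entries with |effect| >= threshold, strongest first."""
    long = (
        corr.rename_axis("row")
        .reset_index()
        .melt(id_vars="row", var_name="column", value_name="effect")
    )
    long = long[long["row"] != long["column"]]
    long = long[long["effect"].abs() >= threshold]
    long = long.sort_values("effect", key=lambda s: s.abs(), ascending=False)
    return long.reset_index(drop=True)


# A few values copied from the table above; unfilled cells are 0.
cols = ["stay", "total_notifs", "total_buy_notifs", "first_total"]
corr = pd.DataFrame(0.0, index=cols, columns=cols)
corr.loc["total_notifs", "stay"] = -0.0154445
corr.loc["total_buy_notifs", "stay"] = -0.0101269
corr.loc["total_buy_notifs", "total_notifs"] = 0.887629
corr.loc["first_total", "stay"] = 0.297157
corr.loc["first_total", "total_notifs"] = -0.00730957
corr.loc["first_total", "total_buy_notifs"] = -0.00953292

print(strong_pairs(corr))
```

On this subset only two pairs survive the cutoff: total_buy_notifs with total_notifs, and first_total with stay.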

Summarizing categorical vs. numerical variables
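
Several results in this section come back as nan, for example stay by origin_city and stay by trip_type. stay has fewer non-null rows (838,920) than status_updates (1,007,692), so one plausible cause, not confirmed by the notebook, is that some groups hold only missing values. Below is a minimal sketch of a comparison that drops missing values and near-empty groups first. `compare_groups` is a hypothetical helper, not the notebook's `summarize_numerical_categorical`. It assumes Welch's independent-samples t-test with Cohen's d for two groups (the printed output labels its two-group test "Paired t-test"), and Kruskal-Wallis with the H-based estimate eta^2 = (H - k + 1) / (n - k) for more than two.

```python
import numpy as np
import pandas as pd
from scipy import stats


def compare_groups(df, num, cat):
    """Return (p_value, effect_size) for a numeric column split by a categorical one."""
    data = df[[num, cat]].dropna()
    # Keep only groups with at least two observations after dropping NaN.
    groups = [g.to_numpy() for _, g in data.groupby(cat)[num] if len(g) > 1]
    if len(groups) < 2:
        return np.nan, np.nan
    if len(groups) == 2:
        a, b = groups
        _, p = stats.ttest_ind(a, b, equal_var=False)  # Welch's t-test
        pooled_var = ((len(a) - 1) * a.var(ddof=1) + (len(b) - 1) * b.var(ddof=1)) / (len(a) + len(b) - 2)
        return p, abs(a.mean() - b.mean()) / np.sqrt(pooled_var)  # Cohen's d
    h, p = stats.kruskal(*groups)
    n, k = sum(len(g) for g in groups), len(groups)
    return p, (h - k + 1) / (n - k)  # H-based eta^2


# Toy data, not the notebook's dataset: one_way rows have no stay value here.
rng = np.random.default_rng(1)
toy = pd.DataFrame({
    "trip_type": ["round_trip"] * 50 + ["one_way"] * 50,
    "weekend": [0, 1] * 50,
    "stay": np.concatenate([rng.normal(7, 2, 50), np.full(50, np.nan)]),
})
toy.loc[toy["weekend"] == 1, "stay"] += 3

print(compare_groups(toy, "stay", "trip_type"))  # only one usable group remains
print(compare_groups(toy, "stay", "weekend"))
```

Returning nan when fewer than two usable groups remain keeps the `if p < .05` filter in the loop below working unchanged, since comparisons with nan evaluate to False.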

In [20]:
numcat_correlations = pd.DataFrame(index=categorical, columns=numerical)
In [29]:
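for i in numerical:
    for j in categorical:
        p, effect = summarize_numerical_categorical(df, num=i, cat=j, plot=True, summary=True, hist=True)
        # nan p-values fail this comparison, so those cells stay empty.
        if p < .05:
            # Column i, row j; .loc avoids chained assignment.
            numcat_correlations.loc[j, i] = effect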

------------ stay by origin_city ------------

count    838920.000000
mean          8.862229
std          14.654277
min           0.000000
25%           3.000000
50%           5.000000
75%           9.000000
max         351.000000
Name: stay, dtype: float64
1324 different categorical values, too many to plot.



 Anova - Kruskal-Wallis
p_val to reject null: nan
H value: nan


Effect size
Eta^2: nan


------------ stay by destination_city ------------

count    838920.000000
mean          8.862229
std          14.654277
min           0.000000
25%           3.000000
50%           5.000000
75%           9.000000
max         351.000000
Name: stay, dtype: float64
1583 different categorical values, too many to plot.



 Anova - Kruskal-Wallis
p_val to reject null: nan
H value: nan


Effect size
Eta^2: nan


------------ stay by trip_type ------------

count    838920.000000
mean          8.862229
std          14.654277
min           0.000000
25%           3.000000
50%           5.000000
75%           9.000000
max         351.000000
Name: stay, dtype: float64

 Paired t-test
p_val to reject null: nan
t-statistic value: nan


Effect size
Cohen's d: nan


------------ stay by weekend ------------

count    838920.000000
mean          8.862229
std          14.654277
min           0.000000
25%           3.000000
50%           5.000000
75%           9.000000
max         351.000000
Name: stay, dtype: float64

 Paired t-test
p_val to reject null: 0.0
t-statistic value: 216.63


Effect size
Cohen's d: 0.68


------------ stay by filter_no_lcc ------------

count    838920.000000
mean          8.862229
std          14.654277
min           0.000000
25%           3.000000
50%           5.000000
75%           9.000000
max         351.000000
Name: stay, dtype: float64

 Paired t-test
p_val to reject null: 0.0
t-statistic value: 10.23


Effect size
Cohen's d: 0.11


------------ stay by filter_non_stop ------------

count    838920.000000
mean          8.862229
std          14.654277
min           0.000000
25%           3.000000
50%           5.000000
75%           9.000000
max         351.000000
Name: stay, dtype: float64

 Paired t-test
p_val to reject null: 0.0
t-statistic value: -34.99


Effect size
Cohen's d: 0.14


------------ stay by filter_short_layover ------------

count    838920.000000
mean          8.862229
std          14.654277
min           0.000000
25%           3.000000
50%           5.000000
75%           9.000000
max         351.000000
Name: stay, dtype: float64

 Paired t-test
p_val to reject null: 0.0
t-statistic value: -29.98


Effect size
Cohen's d: 0.17


------------ stay by filter_name ------------

count    838920.000000
mean          8.862229
std          14.654277
min           0.000000
25%           3.000000
50%           5.000000
75%           9.000000
max         351.000000
Name: stay, dtype: float64

 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 5824.96


Effect size
Eta^2: 0.01


------------ stay by first_rec ------------

count    838920.000000
mean          8.862229
std          14.654277
min           0.000000
25%           3.000000
50%           5.000000
75%           9.000000
max         351.000000
Name: stay, dtype: float64

 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 146.92


Effect size
Eta^2: 0.0


------------ stay by last_rec ------------

count    838920.000000
mean          8.862229
std          14.654277
min           0.000000
25%           3.000000
50%           5.000000
75%           9.000000
max         351.000000
Name: stay, dtype: float64

 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 268.56


Effect size
Eta^2: 0.0


------------ stay by is_session_1 ------------

count    838920.000000
mean          8.862229
std          14.654277
min           0.000000
25%           3.000000
50%           5.000000
75%           9.000000
max         351.000000
Name: stay, dtype: float64

 Paired t-test
p_val to reject null: 0.0
t-statistic value: 8.54


Effect size
Cohen's d: 0.02


------------ stay by Search or watch ------------

count    838920.000000
mean          8.862229
std          14.654277
min           0.000000
25%           3.000000
50%           5.000000
75%           9.000000
max         351.000000
Name: stay, dtype: float64

 Paired t-test
p_val to reject null: 0.0024
t-statistic value: -3.03


Effect size
Cohen's d: 0.01


------------ stay by Use frequency ------------

count    838920.000000
mean          8.862229
std          14.654277
min           0.000000
25%           3.000000
50%           5.000000
75%           9.000000
max         351.000000
Name: stay, dtype: float64

 Paired t-test
p_val to reject null: 0.0
t-statistic value: 5.95


Effect size
Cohen's d: 0.02


------------ stay by continent_origin ------------

count    838920.000000
mean          8.862229
std          14.654277
min           0.000000
25%           3.000000
50%           5.000000
75%           9.000000
max         351.000000
Name: stay, dtype: float64

 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 28526.37


Effect size
Eta^2: 0.03


------------ stay by City origin ------------

count    838920.000000
mean          8.862229
std          14.654277
min           0.000000
25%           3.000000
50%           5.000000
75%           9.000000
max         351.000000
Name: stay, dtype: float64
1237 different categorical values, too many to plot.



 Anova - Kruskal-Wallis
p_val to reject null: nan
H value: nan


Effect size
Eta^2: nan


------------ stay by City destination ------------

count    838920.000000
mean          8.862229
std          14.654277
min           0.000000
25%           3.000000
50%           5.000000
75%           9.000000
max         351.000000
Name: stay, dtype: float64
1487 different categorical values, too many to plot.



 Anova - Kruskal-Wallis
p_val to reject null: nan
H value: nan


Effect size
Eta^2: nan


------------ stay by region_origin ------------

count    838920.000000
mean          8.862229
std          14.654277
min           0.000000
25%           3.000000
50%           5.000000
75%           9.000000
max         351.000000
Name: stay, dtype: float64
453 different categorical values, too many to plot.



 Anova - Kruskal-Wallis
p_val to reject null: nan
H value: nan


Effect size
Eta^2: nan


------------ stay by region_destination ------------

count    838920.000000
mean          8.862229
std          14.654277
min           0.000000
25%           3.000000
50%           5.000000
75%           9.000000
max         351.000000
Name: stay, dtype: float64
498 different categorical values, too many to plot.



 Anova - Kruskal-Wallis
p_val to reject null: nan
H value: nan


Effect size
Eta^2: nan


------------ stay by country_origin ------------

count    838920.000000
mean          8.862229
std          14.654277
min           0.000000
25%           3.000000
50%           5.000000
75%           9.000000
max         351.000000
Name: stay, dtype: float64
188 different categorical values, too many to plot.



 Anova - Kruskal-Wallis
p_val to reject null: nan
H value: nan


Effect size
Eta^2: nan


------------ stay by continent_destination ------------

count    838920.000000
mean          8.862229
std          14.654277
min           0.000000
25%           3.000000
50%           5.000000
75%           9.000000
max         351.000000
Name: stay, dtype: float64

 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 166144.58


Effect size
Eta^2: 0.2


------------ stay by country_destination ------------

count    838920.000000
mean          8.862229
std          14.654277
min           0.000000
25%           3.000000
50%           5.000000
75%           9.000000
max         351.000000
Name: stay, dtype: float64
209 different categorical values, too many to plot.



 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 201683.66


Effect size
Eta^2: 0.24


------------ stay by Domestic or international ------------

count    838920.000000
mean          8.862229
std          14.654277
min           0.000000
25%           3.000000
50%           5.000000
75%           9.000000
max         351.000000
Name: stay, dtype: float64

 Paired t-test
p_val to reject null: 0.0
t-statistic value: 264.28


Effect size
Cohen's d: 0.57


------------ stay by Region ------------

count    838920.000000
mean          8.862229
std          14.654277
min           0.000000
25%           3.000000
50%           5.000000
75%           9.000000
max         351.000000
Name: stay, dtype: float64

 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 34309.42


Effect size
Eta^2: 0.04


------------ stay by IncomeGroup ------------

count    838920.000000
mean          8.862229
std          14.654277
min           0.000000
25%           3.000000
50%           5.000000
75%           9.000000
max         351.000000
Name: stay, dtype: float64

 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 19325.03


Effect size
Eta^2: 0.02


------------ stay by outcome ------------

count    838920.000000
mean          8.862229
std          14.654277
min           0.000000
25%           3.000000
50%           5.000000
75%           9.000000
max         351.000000
Name: stay, dtype: float64

 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 2100.55


Effect size
Eta^2: 0.0


------------ status_updates by origin_city ------------

count    1.007692e+06
mean     1.710792e+00
std      1.205219e+00
min      1.000000e+00
25%      1.000000e+00
50%      1.000000e+00
75%      2.000000e+00
max      1.050000e+02
Name: status_updates, dtype: float64
1324 different categorical values, too many to plot.



 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 4665.43


Effect size
Eta^2: 0.0


------------ status_updates by destination_city ------------

count    1.007692e+06
mean     1.710792e+00
std      1.205219e+00
min      1.000000e+00
25%      1.000000e+00
50%      1.000000e+00
75%      2.000000e+00
max      1.050000e+02
Name: status_updates, dtype: float64
1583 different categorical values, too many to plot.



 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 7607.39


Effect size
Eta^2: 0.01


------------ status_updates by trip_type ------------

count    1.007692e+06
mean     1.710792e+00
std      1.205219e+00
min      1.000000e+00
25%      1.000000e+00
50%      1.000000e+00
75%      2.000000e+00
max      1.050000e+02
Name: status_updates, dtype: float64

 Paired t-test
p_val to reject null: 0.9089
t-statistic value: 0.11


Effect size
Cohen's d: 0.0


------------ status_updates by weekend ------------

count    1.007692e+06
mean     1.710792e+00
std      1.205219e+00
min      1.000000e+00
25%      1.000000e+00
50%      1.000000e+00
75%      2.000000e+00
max      1.050000e+02
Name: status_updates, dtype: float64

 Paired t-test
p_val to reject null: 0.0
t-statistic value: -33.04


Effect size
Cohen's d: 0.08


------------ status_updates by filter_no_lcc ------------

count    1.007692e+06
mean     1.710792e+00
std      1.205219e+00
min      1.000000e+00
25%      1.000000e+00
50%      1.000000e+00
75%      2.000000e+00
max      1.050000e+02
Name: status_updates, dtype: float64

 Paired t-test
p_val to reject null: 0.0
t-statistic value: -46.57


Effect size
Cohen's d: 0.37


------------ status_updates by filter_non_stop ------------

count    1.007692e+06
mean     1.710792e+00
std      1.205219e+00
min      1.000000e+00
25%      1.000000e+00
50%      1.000000e+00
75%      2.000000e+00
max      1.050000e+02
Name: status_updates, dtype: float64

 Paired t-test
p_val to reject null: 0.0
t-statistic value: 135.33


Effect size
Cohen's d: 0.39


------------ status_updates by filter_short_layover ------------

count    1.007692e+06
mean     1.710792e+00
std      1.205219e+00
min      1.000000e+00
25%      1.000000e+00
50%      1.000000e+00
75%      2.000000e+00
max      1.050000e+02
Name: status_updates, dtype: float64

 Paired t-test
p_val to reject null: 0.0
t-statistic value: -66.18


Effect size
Cohen's d: 0.36


------------ status_updates by filter_name ------------

count    1.007692e+06
mean     1.710792e+00
std      1.205219e+00
min      1.000000e+00
25%      1.000000e+00
50%      1.000000e+00
75%      2.000000e+00
max      1.050000e+02
Name: status_updates, dtype: float64

 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 29990.24


Effect size
Eta^2: 0.03


------------ status_updates by first_rec ------------

count    1.007692e+06
mean     1.710792e+00
std      1.205219e+00
min      1.000000e+00
25%      1.000000e+00
50%      1.000000e+00
75%      2.000000e+00
max      1.050000e+02
Name: status_updates, dtype: float64

 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 26561.27


Effect size
Eta^2: 0.03


------------ status_updates by last_rec ------------

count    1.007692e+06
mean     1.710792e+00
std      1.205219e+00
min      1.000000e+00
25%      1.000000e+00
50%      1.000000e+00
75%      2.000000e+00
max      1.050000e+02
Name: status_updates, dtype: float64

 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 10650.08


Effect size
Eta^2: 0.01


------------ status_updates by is_session_1 ------------

count    1.007692e+06
mean     1.710792e+00
std      1.205219e+00
min      1.000000e+00
25%      1.000000e+00
50%      1.000000e+00
75%      2.000000e+00
max      1.050000e+02
Name: status_updates, dtype: float64

 Paired t-test
p_val to reject null: 0.0
t-statistic value: 139.79


Effect size
Cohen's d: 0.28


------------ status_updates by Search or watch ------------

count    1.007692e+06
mean     1.710792e+00
std      1.205219e+00
min      1.000000e+00
25%      1.000000e+00
50%      1.000000e+00
75%      2.000000e+00
max      1.050000e+02
Name: status_updates, dtype: float64

 Paired t-test
p_val to reject null: 0.0
t-statistic value: 1217.04


Effect size
Cohen's d: 1


------------ status_updates by Use frequency ------------

count    1.007692e+06
mean     1.710792e+00
std      1.205219e+00
min      1.000000e+00
25%      1.000000e+00
50%      1.000000e+00
75%      2.000000e+00
max      1.050000e+02
Name: status_updates, dtype: float64

 Paired t-test
p_val to reject null: 0.0
t-statistic value: 50.49


Effect size
Cohen's d: 0.19


------------ status_updates by continent_origin ------------

count    1.007692e+06
mean     1.710792e+00
std      1.205219e+00
min      1.000000e+00
25%      1.000000e+00
50%      1.000000e+00
75%      2.000000e+00
max      1.050000e+02
Name: status_updates, dtype: float64

 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 306.43


Effect size
Eta^2: 0.0


------------ status_updates by City origin ------------

count    1.007692e+06
mean     1.710792e+00
std      1.205219e+00
min      1.000000e+00
25%      1.000000e+00
50%      1.000000e+00
75%      2.000000e+00
max      1.050000e+02
Name: status_updates, dtype: float64
1237 different categorical values, too many to plot.



 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 4003.86


Effect size
Eta^2: 0.0


------------ status_updates by City destination ------------

count    1.007692e+06
mean     1.710792e+00
std      1.205219e+00
min      1.000000e+00
25%      1.000000e+00
50%      1.000000e+00
75%      2.000000e+00
max      1.050000e+02
Name: status_updates, dtype: float64
1487 different categorical values, too many to plot.



 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 7184.58


Effect size
Eta^2: 0.01


------------ status_updates by region_origin ------------

count    1.007692e+06
mean     1.710792e+00
std      1.205219e+00
min      1.000000e+00
25%      1.000000e+00
50%      1.000000e+00
75%      2.000000e+00
max      1.050000e+02
Name: status_updates, dtype: float64
453 different categorical values, too many to plot.



 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 2223.07


Effect size
Eta^2: 0.0


------------ status_updates by region_destination ------------

count    1.007692e+06
mean     1.710792e+00
std      1.205219e+00
min      1.000000e+00
25%      1.000000e+00
50%      1.000000e+00
75%      2.000000e+00
max      1.050000e+02
Name: status_updates, dtype: float64
498 different categorical values, too many to plot.



 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 4646.33


Effect size
Eta^2: 0.0


------------ status_updates by country_origin ------------

count    1.007692e+06
mean     1.710792e+00
std      1.205219e+00
min      1.000000e+00
25%      1.000000e+00
50%      1.000000e+00
75%      2.000000e+00
max      1.050000e+02
Name: status_updates, dtype: float64
188 different categorical values, too many to plot.



 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 1501.8


Effect size
Eta^2: 0.0


------------ status_updates by continent_destination ------------

count    1.007692e+06
mean     1.710792e+00
std      1.205219e+00
min      1.000000e+00
25%      1.000000e+00
50%      1.000000e+00
75%      2.000000e+00
max      1.050000e+02
Name: status_updates, dtype: float64

 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 1663.96


Effect size
Eta^2: 0.0


------------ status_updates by country_destination ------------

count    1.007692e+06
mean     1.710792e+00
std      1.205219e+00
min      1.000000e+00
25%      1.000000e+00
50%      1.000000e+00
75%      2.000000e+00
max      1.050000e+02
Name: status_updates, dtype: float64
209 different categorical values, too many to plot.



 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 3450.18


Effect size
Eta^2: 0.0


------------ status_updates by Domestic or international ------------

count    1.007692e+06
mean     1.710792e+00
std      1.205219e+00
min      1.000000e+00
25%      1.000000e+00
50%      1.000000e+00
75%      2.000000e+00
max      1.050000e+02
Name: status_updates, dtype: float64

 Paired t-test
p_val to reject null: 0.0
t-statistic value: -27.99


Effect size
Cohen's d: 0.06
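
Note on reading Cohen's d here: the t-statistic in this block is negative (-27.99) while the reported d is positive, which suggests the notebook prints the magnitude of d. The function that produces it is not shown in this output, so the sketch below is only one common definition (two independent groups, pooled standard deviation), and the sample values are hypothetical. With roughly a million rows, even a very small d yields a large t-statistic, which is why the effect size is the more useful number to scan.

```python
import math
import statistics


def cohens_d(a, b):
    """Cohen's d for two independent samples, using the pooled standard deviation."""
    n_a, n_b = len(a), len(b)
    var_a, var_b = statistics.variance(a), statistics.variance(b)
    pooled_sd = math.sqrt(((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2))
    return (statistics.mean(a) - statistics.mean(b)) / pooled_sd


# Hypothetical toy samples, not taken from the dataset.
domestic = [1, 1, 2, 1, 3]
international = [2, 1, 3, 2, 4]

d = cohens_d(domestic, international)
# Signed value first, then the magnitude (the form the output above seems to use).
print(round(d, 2), round(abs(d), 2))
```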


------------ status_updates by Region ------------

count    1.007692e+06
mean     1.710792e+00
std      1.205219e+00
min      1.000000e+00
25%      1.000000e+00
50%      1.000000e+00
75%      2.000000e+00
max      1.050000e+02
Name: status_updates, dtype: float64

 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 304.81


Effect size
Eta^2: 0.0


------------ status_updates by IncomeGroup ------------

count    1.007692e+06
mean     1.710792e+00
std      1.205219e+00
min      1.000000e+00
25%      1.000000e+00
50%      1.000000e+00
75%      2.000000e+00
max      1.050000e+02
Name: status_updates, dtype: float64

 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 164.86


Effect size
Eta^2: 0.0


------------ status_updates by outcome ------------

count    1.007692e+06
mean     1.710792e+00
std      1.205219e+00
min      1.000000e+00
25%      1.000000e+00
50%      1.000000e+00
75%      2.000000e+00
max      1.050000e+02
Name: status_updates, dtype: float64

 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 924733.87


Effect size
Eta^2: 0.92
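
Note on how Eta^2 relates to H: the reported values are consistent with a rank-based effect size derived from the Kruskal-Wallis statistic. The notebook's own formula is not shown in this output, so the two functions below are assumptions: epsilon squared, H / (n - 1), and the group-adjusted eta squared, (H - k + 1) / (n - k). With n this large, both round to the same value. The number of outcome categories (k = 3) is hypothetical, and n is taken from the count above, which assumes no rows were dropped before the test.

```python
def epsilon_squared(h: float, n: int) -> float:
    """Rank-based effect size for Kruskal-Wallis: H / (n - 1)."""
    return h / (n - 1)


def eta_squared_h(h: float, n: int, k: int) -> float:
    """Alternative estimate that adjusts for the number of groups k."""
    return (h - k + 1) / (n - k)


# Values copied from the status_updates by outcome block above.
h_value = 924733.87
n_obs = 1007692  # count reported in the summary; assumes no rows were dropped

print(round(epsilon_squared(h_value, n_obs), 2))     # 0.92
# k=3 is a hypothetical number of outcome categories.
print(round(eta_squared_h(h_value, n_obs, k=3), 2))  # 0.92
```

This also explains why H values in the thousands still print as Eta^2: 0.0 in the blocks above: divided by roughly a million observations, they round to zero at two decimals.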


------------ total_notifs by origin_city ------------

count    949529.000000
mean          1.644821
std           3.872667
min           0.000000
25%           0.000000
50%           0.000000
75%           1.000000
max          65.000000
Name: total_notifs, dtype: float64
1324 different categorical values, too many to plot.



 Anova - Kruskal-Wallis
p_val to reject null: nan
H value: nan


Effect size
Eta^2: nan
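
Note on the nan results: the output alone does not say why the test returns nan for this pairing and several others below. One candidate explanation: total_notifs has 949,529 non-null values against 1,007,692 for status_updates, so some rows are missing, and if the test is run with scipy.stats.kruskal (an assumption), its default nan_policy="propagate" returns nan whenever any group contains a missing value. The helper below is a hypothetical diagnostic, not part of the notebook; it flags groups that are empty or contain NaN so they can be inspected before the test is rerun.

```python
import math


def problem_groups(groups: dict) -> dict:
    """Return the groups that are empty or contain NaN, with the reason."""
    problems = {}
    for name, values in groups.items():
        if len(values) == 0:
            problems[name] = "empty"
        elif any(isinstance(v, float) and math.isnan(v) for v in values):
            problems[name] = "contains NaN"
    return problems


# Hypothetical toy data; the city codes are made up for illustration.
groups = {
    "AAA": [0.0, 1.0, 3.0],
    "BBB": [2.0, float("nan")],
    "CCC": [],
}
print(problem_groups(groups))
# {'BBB': 'contains NaN', 'CCC': 'empty'}
```

If the check confirms missing values, dropping them from the numerical column before grouping is the usual remedy.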


------------ total_notifs by destination_city ------------

count    949529.000000
mean          1.644821
std           3.872667
min           0.000000
25%           0.000000
50%           0.000000
75%           1.000000
max          65.000000
Name: total_notifs, dtype: float64
1583 different categorical values, too many to plot.



 Anova - Kruskal-Wallis
p_val to reject null: nan
H value: nan


Effect size
Eta^2: nan


------------ total_notifs by trip_type ------------

count    949529.000000
mean          1.644821
std           3.872667
min           0.000000
25%           0.000000
50%           0.000000
75%           1.000000
max          65.000000
Name: total_notifs, dtype: float64

 Paired t-test
p_val to reject null: 0.0
t-statistic value: 38.83


Effect size
Cohen's d: 0.11


------------ total_notifs by weekend ------------

count    949529.000000
mean          1.644821
std           3.872667
min           0.000000
25%           0.000000
50%           0.000000
75%           1.000000
max          65.000000
Name: total_notifs, dtype: float64

 Paired t-test
p_val to reject null: 0.0
t-statistic value: -44.97


Effect size
Cohen's d: 0.11


------------ total_notifs by filter_no_lcc ------------

count    949529.000000
mean          1.644821
std           3.872667
min           0.000000
25%           0.000000
50%           0.000000
75%           1.000000
max          65.000000
Name: total_notifs, dtype: float64

 Paired t-test
p_val to reject null: 0.0
t-statistic value: -6.41


Effect size
Cohen's d: 0.06


------------ total_notifs by filter_non_stop ------------

count    949529.000000
mean          1.644821
std           3.872667
min           0.000000
25%           0.000000
50%           0.000000
75%           1.000000
max          65.000000
Name: total_notifs, dtype: float64

 Paired t-test
p_val to reject null: 0.0
t-statistic value: 76.55


Effect size
Cohen's d: 0.24


------------ total_notifs by filter_short_layover ------------

count    949529.000000
mean          1.644821
std           3.872667
min           0.000000
25%           0.000000
50%           0.000000
75%           1.000000
max          65.000000
Name: total_notifs, dtype: float64

 Paired t-test
p_val to reject null: 0.0
t-statistic value: -27.74


Effect size
Cohen's d: 0.16


------------ total_notifs by filter_name ------------

count    949529.000000
mean          1.644821
std           3.872667
min           0.000000
25%           0.000000
50%           0.000000
75%           1.000000
max          65.000000
Name: total_notifs, dtype: float64

 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 15709.82


Effect size
Eta^2: 0.02


------------ total_notifs by first_rec ------------

count    949529.000000
mean          1.644821
std           3.872667
min           0.000000
25%           0.000000
50%           0.000000
75%           1.000000
max          65.000000
Name: total_notifs, dtype: float64

 Anova - Kruskal-Wallis
p_val to reject null: nan
H value: nan


Effect size
Eta^2: nan


------------ total_notifs by last_rec ------------

count    949529.000000
mean          1.644821
std           3.872667
min           0.000000
25%           0.000000
50%           0.000000
75%           1.000000
max          65.000000
Name: total_notifs, dtype: float64

 Anova - Kruskal-Wallis
p_val to reject null: nan
H value: nan


Effect size
Eta^2: nan


------------ total_notifs by is_session_1 ------------

count    949529.000000
mean          1.644821
std           3.872667
min           0.000000
25%           0.000000
50%           0.000000
75%           1.000000
max          65.000000
Name: total_notifs, dtype: float64

 Paired t-test
p_val to reject null: 0.0
t-statistic value: 128.51


Effect size
Cohen's d: 0.26


------------ total_notifs by Search or watch ------------

count    949529.000000
mean          1.644821
std           3.872667
min           0.000000
25%           0.000000
50%           0.000000
75%           1.000000
max          65.000000
Name: total_notifs, dtype: float64

 Paired t-test
p_val to reject null: 0.0
t-statistic value: 666.89


Effect size
Cohen's d: 1


------------ total_notifs by Use frequency ------------

count    949529.000000
mean          1.644821
std           3.872667
min           0.000000
25%           0.000000
50%           0.000000
75%           1.000000
max          65.000000
Name: total_notifs, dtype: float64

 Paired t-test
p_val to reject null: 0.0
t-statistic value: 73.11


Effect size
Cohen's d: 0.25


------------ total_notifs by continent_origin ------------

count    949529.000000
mean          1.644821
std           3.872667
min           0.000000
25%           0.000000
50%           0.000000
75%           1.000000
max          65.000000
Name: total_notifs, dtype: float64

 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 201.76


Effect size
Eta^2: 0.0


------------ total_notifs by City origin ------------

count    949529.000000
mean          1.644821
std           3.872667
min           0.000000
25%           0.000000
50%           0.000000
75%           1.000000
max          65.000000
Name: total_notifs, dtype: float64
1237 different categorical values, too many to plot.



 Anova - Kruskal-Wallis
p_val to reject null: nan
H value: nan


Effect size
Eta^2: nan


------------ total_notifs by City destination ------------

count    949529.000000
mean          1.644821
std           3.872667
min           0.000000
25%           0.000000
50%           0.000000
75%           1.000000
max          65.000000
Name: total_notifs, dtype: float64
1487 different categorical values, too many to plot.



 Anova - Kruskal-Wallis
p_val to reject null: nan
H value: nan


Effect size
Eta^2: nan


------------ total_notifs by region_origin ------------

count    949529.000000
mean          1.644821
std           3.872667
min           0.000000
25%           0.000000
50%           0.000000
75%           1.000000
max          65.000000
Name: total_notifs, dtype: float64
453 different categorical values, too many to plot.



 Anova - Kruskal-Wallis
p_val to reject null: nan
H value: nan


Effect size
Eta^2: nan


------------ total_notifs by region_destination ------------

count    949529.000000
mean          1.644821
std           3.872667
min           0.000000
25%           0.000000
50%           0.000000
75%           1.000000
max          65.000000
Name: total_notifs, dtype: float64
498 different categorical values, too many to plot.



 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 3068.02


Effect size
Eta^2: 0.0


------------ total_notifs by country_origin ------------

count    949529.000000
mean          1.644821
std           3.872667
min           0.000000
25%           0.000000
50%           0.000000
75%           1.000000
max          65.000000
Name: total_notifs, dtype: float64
188 different categorical values, too many to plot.



 Anova - Kruskal-Wallis
p_val to reject null: nan
H value: nan


Effect size
Eta^2: nan


------------ total_notifs by continent_destination ------------

count    949529.000000
mean          1.644821
std           3.872667
min           0.000000
25%           0.000000
50%           0.000000
75%           1.000000
max          65.000000
Name: total_notifs, dtype: float64

 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 614.39


Effect size
Eta^2: 0.0


------------ total_notifs by country_destination ------------

count    949529.000000
mean          1.644821
std           3.872667
min           0.000000
25%           0.000000
50%           0.000000
75%           1.000000
max          65.000000
Name: total_notifs, dtype: float64
209 different categorical values, too many to plot.



 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 2056.57


Effect size
Eta^2: 0.0


------------ total_notifs by Domestic or international ------------

count    949529.000000
mean          1.644821
std           3.872667
min           0.000000
25%           0.000000
50%           0.000000
75%           1.000000
max          65.000000
Name: total_notifs, dtype: float64

 Paired t-test
p_val to reject null: 0.0
t-statistic value: -21.96


Effect size
Cohen's d: 0.05


------------ total_notifs by Region ------------

count    949529.000000
mean          1.644821
std           3.872667
min           0.000000
25%           0.000000
50%           0.000000
75%           1.000000
max          65.000000
Name: total_notifs, dtype: float64

 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 280.32


Effect size
Eta^2: 0.0


------------ total_notifs by IncomeGroup ------------

count    949529.000000
mean          1.644821
std           3.872667
min           0.000000
25%           0.000000
50%           0.000000
75%           1.000000
max          65.000000
Name: total_notifs, dtype: float64

 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 114.83


Effect size
Eta^2: 0.0


------------ total_notifs by outcome ------------

count    949529.000000
mean          1.644821
std           3.872667
min           0.000000
25%           0.000000
50%           0.000000
75%           1.000000
max          65.000000
Name: total_notifs, dtype: float64

 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 602301.52


Effect size
Eta^2: 0.63


------------ total_buy_notifs by origin_city ------------

count    949529.000000
mean          0.947830
std           2.790264
min           0.000000
25%           0.000000
50%           0.000000
75%           0.000000
max          65.000000
Name: total_buy_notifs, dtype: float64
1324 different categorical values, too many to plot.



 Anova - Kruskal-Wallis
p_val to reject null: nan
H value: nan


Effect size
Eta^2: nan


------------ total_buy_notifs by destination_city ------------

count    949529.000000
mean          0.947830
std           2.790264
min           0.000000
25%           0.000000
50%           0.000000
75%           0.000000
max          65.000000
Name: total_buy_notifs, dtype: float64
1583 different categorical values, too many to plot.



 Anova - Kruskal-Wallis
p_val to reject null: nan
H value: nan


Effect size
Eta^2: nan


------------ total_buy_notifs by trip_type ------------

count    949529.000000
mean          0.947830
std           2.790264
min           0.000000
25%           0.000000
50%           0.000000
75%           0.000000
max          65.000000
Name: total_buy_notifs, dtype: float64

 Paired t-test
p_val to reject null: 0.0
t-statistic value: 37.01


Effect size
Cohen's d: 0.11


------------ total_buy_notifs by weekend ------------

count    949529.000000
mean          0.947830
std           2.790264
min           0.000000
25%           0.000000
50%           0.000000
75%           0.000000
max          65.000000
Name: total_buy_notifs, dtype: float64

 Paired t-test
p_val to reject null: 0.0
t-statistic value: -42.34


Effect size
Cohen's d: 0.1


------------ total_buy_notifs by filter_no_lcc ------------

count    949529.000000
mean          0.947830
std           2.790264
min           0.000000
25%           0.000000
50%           0.000000
75%           0.000000
max          65.000000
Name: total_buy_notifs, dtype: float64

 Paired t-test
p_val to reject null: 0.0107
t-statistic value: 2.55


Effect size
Cohen's d: 0.02
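
Note on p-values printed as 0.0: this block shows 0.0107, which suggests the p-values are rounded to about four decimals before printing, so "0.0" elsewhere most likely means "smaller than the printed precision" rather than exactly zero. The formatter below is a hypothetical helper, not something the notebook defines; it sketches one way to make that distinction visible.

```python
def format_p_value(p: float, decimals: int = 4) -> str:
    """Format a p-value, reporting values below the printed precision as '< threshold'."""
    threshold = 10 ** -decimals
    if p < threshold:
        return f"< {threshold:.{decimals}f}"
    return f"{p:.{decimals}f}"


print(format_p_value(0.0107))  # 0.0107
print(format_p_value(1e-30))   # < 0.0001
print(format_p_value(0.905))   # 0.9050
```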


------------ total_buy_notifs by filter_non_stop ------------

count    949529.000000
mean          0.947830
std           2.790264
min           0.000000
25%           0.000000
50%           0.000000
75%           0.000000
max          65.000000
Name: total_buy_notifs, dtype: float64

 Paired t-test
p_val to reject null: 0.0
t-statistic value: 57.45


Effect size
Cohen's d: 0.18


------------ total_buy_notifs by filter_short_layover ------------

count    949529.000000
mean          0.947830
std           2.790264
min           0.000000
25%           0.000000
50%           0.000000
75%           0.000000
max          65.000000
Name: total_buy_notifs, dtype: float64

 Paired t-test
p_val to reject null: 0.0
t-statistic value: -16.74


Effect size
Cohen's d: 0.1


------------ total_buy_notifs by filter_name ------------

count    949529.000000
mean          0.947830
std           2.790264
min           0.000000
25%           0.000000
50%           0.000000
75%           0.000000
max          65.000000
Name: total_buy_notifs, dtype: float64

 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 8801.53


Effect size
Eta^2: 0.01


------------ total_buy_notifs by first_rec ------------

count    949529.000000
mean          0.947830
std           2.790264
min           0.000000
25%           0.000000
50%           0.000000
75%           0.000000
max          65.000000
Name: total_buy_notifs, dtype: float64

 Anova - Kruskal-Wallis
p_val to reject null: nan
H value: nan


Effect size
Eta^2: nan


------------ total_buy_notifs by last_rec ------------

count    949529.000000
mean          0.947830
std           2.790264
min           0.000000
25%           0.000000
50%           0.000000
75%           0.000000
max          65.000000
Name: total_buy_notifs, dtype: float64

 Anova - Kruskal-Wallis
p_val to reject null: nan
H value: nan


Effect size
Eta^2: nan


------------ total_buy_notifs by is_session_1 ------------

count    949529.000000
mean          0.947830
std           2.790264
min           0.000000
25%           0.000000
50%           0.000000
75%           0.000000
max          65.000000
Name: total_buy_notifs, dtype: float64

 Paired t-test
p_val to reject null: 0.0
t-statistic value: 109.34


Effect size
Cohen's d: 0.22


------------ total_buy_notifs by Search or watch ------------

count    949529.000000
mean          0.947830
std           2.790264
min           0.000000
25%           0.000000
50%           0.000000
75%           0.000000
max          65.000000
Name: total_buy_notifs, dtype: float64

 Paired t-test
p_val to reject null: 0.0
t-statistic value: 493.36


Effect size
Cohen's d: 0.9


------------ total_buy_notifs by Use frequency ------------

count    949529.000000
mean          0.947830
std           2.790264
min           0.000000
25%           0.000000
50%           0.000000
75%           0.000000
max          65.000000
Name: total_buy_notifs, dtype: float64

 Paired t-test
p_val to reject null: 0.0
t-statistic value: 66.42


Effect size
Cohen's d: 0.23


------------ total_buy_notifs by continent_origin ------------

count    949529.000000
mean          0.947830
std           2.790264
min           0.000000
25%           0.000000
50%           0.000000
75%           0.000000
max          65.000000
Name: total_buy_notifs, dtype: float64

 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 146.96


Effect size
Eta^2: 0.0


------------ total_buy_notifs by City origin ------------

count    949529.000000
mean          0.947830
std           2.790264
min           0.000000
25%           0.000000
50%           0.000000
75%           0.000000
max          65.000000
Name: total_buy_notifs, dtype: float64
1237 different categorical values, too many to plot.



 Anova - Kruskal-Wallis
p_val to reject null: nan
H value: nan


Effect size
Eta^2: nan


------------ total_buy_notifs by City destination ------------

count    949529.000000
mean          0.947830
std           2.790264
min           0.000000
25%           0.000000
50%           0.000000
75%           0.000000
max          65.000000
Name: total_buy_notifs, dtype: float64
1487 different categorical values, too many to plot.



 Anova - Kruskal-Wallis
p_val to reject null: nan
H value: nan


Effect size
Eta^2: nan


------------ total_buy_notifs by region_origin ------------

count    949529.000000
mean          0.947830
std           2.790264
min           0.000000
25%           0.000000
50%           0.000000
75%           0.000000
max          65.000000
Name: total_buy_notifs, dtype: float64
453 different categorical values, too many to plot.



 Anova - Kruskal-Wallis
p_val to reject null: nan
H value: nan


Effect size
Eta^2: nan


------------ total_buy_notifs by region_destination ------------

count    949529.000000
mean          0.947830
std           2.790264
min           0.000000
25%           0.000000
50%           0.000000
75%           0.000000
max          65.000000
Name: total_buy_notifs, dtype: float64
498 different categorical values, too many to plot.



 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 2358.23


Effect size
Eta^2: 0.0


------------ total_buy_notifs by country_origin ------------

count    949529.000000
mean          0.947830
std           2.790264
min           0.000000
25%           0.000000
50%           0.000000
75%           0.000000
max          65.000000
Name: total_buy_notifs, dtype: float64
188 different categorical values, too many to plot.



 Anova - Kruskal-Wallis
p_val to reject null: nan
H value: nan


Effect size
Eta^2: nan


------------ total_buy_notifs by continent_destination ------------

count    949529.000000
mean          0.947830
std           2.790264
min           0.000000
25%           0.000000
50%           0.000000
75%           0.000000
max          65.000000
Name: total_buy_notifs, dtype: float64

 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 475.93


Effect size
Eta^2: 0.0


------------ total_buy_notifs by country_destination ------------

count    949529.000000
mean          0.947830
std           2.790264
min           0.000000
25%           0.000000
50%           0.000000
75%           0.000000
max          65.000000
Name: total_buy_notifs, dtype: float64
209 different categorical values, too many to plot.



 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 1625.59


Effect size
Eta^2: 0.0


------------ total_buy_notifs by Domestic or international ------------

count    949529.000000
mean          0.947830
std           2.790264
min           0.000000
25%           0.000000
50%           0.000000
75%           0.000000
max          65.000000
Name: total_buy_notifs, dtype: float64

 Paired t-test
p_val to reject null: 0.0
t-statistic value: -20.78


Effect size
Cohen's d: 0.04


------------ total_buy_notifs by Region ------------

count    949529.000000
mean          0.947830
std           2.790264
min           0.000000
25%           0.000000
50%           0.000000
75%           0.000000
max          65.000000
Name: total_buy_notifs, dtype: float64

 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 77.25


Effect size
Eta^2: 0.0


------------ total_buy_notifs by IncomeGroup ------------

count    949529.000000
mean          0.947830
std           2.790264
min           0.000000
25%           0.000000
50%           0.000000
75%           0.000000
max          65.000000
Name: total_buy_notifs, dtype: float64

 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 37.12


Effect size
Eta^2: 0.0


------------ total_buy_notifs by outcome ------------

count    949529.000000
mean          0.947830
std           2.790264
min           0.000000
25%           0.000000
50%           0.000000
75%           0.000000
max          65.000000
Name: total_buy_notifs, dtype: float64

 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 385367.86


Effect size
Eta^2: 0.41


------------ first_total by origin_city ------------

count    949529.000000
mean        484.516101
std         393.182042
min           9.000000
25%         229.000000
50%         372.000000
75%         627.000000
max       21539.000000
Name: first_total, dtype: float64
1324 different categorical values, too many to plot.



 Anova - Kruskal-Wallis
p_val to reject null: nan
H value: nan


Effect size
Eta^2: nan


------------ first_total by destination_city ------------

count    949529.000000
mean        484.516101
std         393.182042
min           9.000000
25%         229.000000
50%         372.000000
75%         627.000000
max       21539.000000
Name: first_total, dtype: float64
1583 different categorical values, too many to plot.



 Anova - Kruskal-Wallis
p_val to reject null: nan
H value: nan


Effect size
Eta^2: nan


------------ first_total by trip_type ------------

count    949529.000000
mean        484.516101
std         393.182042
min           9.000000
25%         229.000000
50%         372.000000
75%         627.000000
max       21539.000000
Name: first_total, dtype: float64

 Paired t-test
p_val to reject null: 0.0
t-statistic value: 199.0


Effect size
Cohen's d: 0.58


------------ first_total by weekend ------------

count    949529.000000
mean        484.516101
std         393.182042
min           9.000000
25%         229.000000
50%         372.000000
75%         627.000000
max       21539.000000
Name: first_total, dtype: float64

 Paired t-test
p_val to reject null: 0.0
t-statistic value: 178.82


Effect size
Cohen's d: 0.54


------------ first_total by filter_no_lcc ------------

count    949529.000000
mean        484.516101
std         393.182042
min           9.000000
25%         229.000000
50%         372.000000
75%         627.000000
max       21539.000000
Name: first_total, dtype: float64

 Paired t-test
p_val to reject null: 0.0
t-statistic value: 10.06


Effect size
Cohen's d: 0.09


------------ first_total by filter_non_stop ------------

count    949529.000000
mean        484.516101
std         393.182042
min           9.000000
25%         229.000000
50%         372.000000
75%         627.000000
max       21539.000000
Name: first_total, dtype: float64

 Paired t-test
p_val to reject null: 0.0
t-statistic value: -45.12


Effect size
Cohen's d: 0.15


------------ first_total by filter_short_layover ------------

count    949529.000000
mean        484.516101
std         393.182042
min           9.000000
25%         229.000000
50%         372.000000
75%         627.000000
max       21539.000000
Name: first_total, dtype: float64

 Paired t-test
p_val to reject null: 0.0
t-statistic value: -88.63


Effect size
Cohen's d: 0.46


------------ first_total by filter_name ------------

count    949529.000000
mean        484.516101
std         393.182042
min           9.000000
25%         229.000000
50%         372.000000
75%         627.000000
max       21539.000000
Name: first_total, dtype: float64

 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 12364.14


Effect size
Eta^2: 0.01


------------ first_total by first_rec ------------

count    949529.000000
mean        484.516101
std         393.182042
min           9.000000
25%         229.000000
50%         372.000000
75%         627.000000
max       21539.000000
Name: first_total, dtype: float64

 Anova - Kruskal-Wallis
p_val to reject null: nan
H value: nan


Effect size
Eta^2: nan


------------ first_total by last_rec ------------

count    949529.000000
mean        484.516101
std         393.182042
min           9.000000
25%         229.000000
50%         372.000000
75%         627.000000
max       21539.000000
Name: first_total, dtype: float64

 Anova - Kruskal-Wallis
p_val to reject null: nan
H value: nan


Effect size
Eta^2: nan


------------ first_total by is_session_1 ------------

count    949529.000000
mean        484.516101
std         393.182042
min           9.000000
25%         229.000000
50%         372.000000
75%         627.000000
max       21539.000000
Name: first_total, dtype: float64

 Paired t-test
p_val to reject null: 0.0
t-statistic value: 27.99


Effect size
Cohen's d: 0.06


------------ first_total by Search or watch ------------

count    949529.000000
mean        484.516101
std         393.182042
min           9.000000
25%         229.000000
50%         372.000000
75%         627.000000
max       21539.000000
Name: first_total, dtype: float64

 Paired t-test
p_val to reject null: 0.0
t-statistic value: -17.95


Effect size
Cohen's d: 0.04


------------ first_total by Use frequency ------------

count    949529.000000
mean        484.516101
std         393.182042
min           9.000000
25%         229.000000
50%         372.000000
75%         627.000000
max       21539.000000
Name: first_total, dtype: float64

 Paired t-test
p_val to reject null: 0.0
t-statistic value: 14.18


Effect size
Cohen's d: 0.06


------------ first_total by continent_origin ------------

count    949529.000000
mean        484.516101
std         393.182042
min           9.000000
25%         229.000000
50%         372.000000
75%         627.000000
max       21539.000000
Name: first_total, dtype: float64

 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 21929.51


Effect size
Eta^2: 0.02


------------ first_total by City origin ------------

count    949529.000000
mean        484.516101
std         393.182042
min           9.000000
25%         229.000000
50%         372.000000
75%         627.000000
max       21539.000000
Name: first_total, dtype: float64
1237 different categorical values, too many to plot.



 Anova - Kruskal-Wallis
p_val to reject null: nan
H value: nan


Effect size
Eta^2: nan


------------ first_total by City destination ------------

count    949529.000000
mean        484.516101
std         393.182042
min           9.000000
25%         229.000000
50%         372.000000
75%         627.000000
max       21539.000000
Name: first_total, dtype: float64
1487 different categorical values, too many to plot.



 Anova - Kruskal-Wallis
p_val to reject null: nan
H value: nan


Effect size
Eta^2: nan


------------ first_total by region_origin ------------

count    949529.000000
mean        484.516101
std         393.182042
min           9.000000
25%         229.000000
50%         372.000000
75%         627.000000
max       21539.000000
Name: first_total, dtype: float64
453 different categorical values, too many to plot.



 Anova - Kruskal-Wallis
p_val to reject null: nan
H value: nan


Effect size
Eta^2: nan


------------ first_total by region_destination ------------

count    949529.000000
mean        484.516101
std         393.182042
min           9.000000
25%         229.000000
50%         372.000000
75%         627.000000
max       21539.000000
Name: first_total, dtype: float64
498 different categorical values, too many to plot.



 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 269291.13


Effect size
Eta^2: 0.28


------------ first_total by country_origin ------------

count    949529.000000
mean        484.516101
std         393.182042
min           9.000000
25%         229.000000
50%         372.000000
75%         627.000000
max       21539.000000
Name: first_total, dtype: float64
188 different categorical values, too many to plot.



 Anova - Kruskal-Wallis
p_val to reject null: nan
H value: nan


Effect size
Eta^2: nan


------------ first_total by continent_destination ------------

count    949529.000000
mean        484.516101
std         393.182042
min           9.000000
25%         229.000000
50%         372.000000
75%         627.000000
max       21539.000000
Name: first_total, dtype: float64

 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 174460.05


Effect size
Eta^2: 0.18


------------ first_total by country_destination ------------

count    949529.000000
mean        484.516101
std         393.182042
min           9.000000
25%         229.000000
50%         372.000000
75%         627.000000
max       21539.000000
Name: first_total, dtype: float64
209 different categorical values, too many to plot.



 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 235544.03


Effect size
Eta^2: 0.25


------------ first_total by Domestic or international ------------

count    949529.000000
mean        484.516101
std         393.182042
min           9.000000
25%         229.000000
50%         372.000000
75%         627.000000
max       21539.000000
Name: first_total, dtype: float64

 Paired t-test
p_val to reject null: 0.0
t-statistic value: 502.59


Effect size
Cohen's d: 1


------------ first_total by Region ------------

count    949529.000000
mean        484.516101
std         393.182042
min           9.000000
25%         229.000000
50%         372.000000
75%         627.000000
max       21539.000000
Name: first_total, dtype: float64

 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 16344.92


Effect size
Eta^2: 0.02


------------ first_total by IncomeGroup ------------

count    949529.000000
mean        484.516101
std         393.182042
min           9.000000
25%         229.000000
50%         372.000000
75%         627.000000
max       21539.000000
Name: first_total, dtype: float64

 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 12645.92


Effect size
Eta^2: 0.01


------------ first_total by outcome ------------

count    949529.000000
mean        484.516101
std         393.182042
min           9.000000
25%         229.000000
50%         372.000000
75%         627.000000
max       21539.000000
Name: first_total, dtype: float64

 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 2640.89


Effect size
Eta^2: 0.0


------------ first_buy_total by origin_city ------------

count    597414.000000
mean        461.218400
std         365.240937
min           9.000000
25%         217.000000
50%         361.000000
75%         602.000000
max       20103.000000
Name: first_buy_total, dtype: float64
1324 different categorical values, too many to plot.



 Anova - Kruskal-Wallis
p_val to reject null: nan
H value: nan


Effect size
Eta^2: nan


------------ first_buy_total by destination_city ------------

count    597414.000000
mean        461.218400
std         365.240937
min           9.000000
25%         217.000000
50%         361.000000
75%         602.000000
max       20103.000000
Name: first_buy_total, dtype: float64
1583 different categorical values, too many to plot.



 Anova - Kruskal-Wallis
p_val to reject null: nan
H value: nan


Effect size
Eta^2: nan


------------ first_buy_total by trip_type ------------

count    597414.000000
mean        461.218400
std         365.240937
min           9.000000
25%         217.000000
50%         361.000000
75%         602.000000
max       20103.000000
Name: first_buy_total, dtype: float64

 Paired t-test
p_val to reject null: 0.0
t-statistic value: 170.35


Effect size
Cohen's d: 0.64
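
For two-level variables such as `trip_type`, the block above reports a t-statistic and Cohen's d. A minimal sketch of how such a pair could be computed follows, under stated assumptions: the function name `two_group_summary` is hypothetical, the t-test is an independent-samples Welch test (groups defined by a binary category usually differ in size), and d is reported as an absolute value because the values in this output are non-negative even when the t-statistic is negative. Whether the notebook computes them this way is not visible here.

```python
import numpy as np
from scipy import stats


def two_group_summary(a, b):
    """Return (t statistic, p-value, |Cohen's d|) for two independent samples."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)

    # Welch's t-test: does not assume equal variances or equal group sizes.
    t_stat, p_val = stats.ttest_ind(a, b, equal_var=False)

    # Cohen's d: difference in means divided by the pooled standard deviation.
    n_a, n_b = len(a), len(b)
    pooled_var = (
        (n_a - 1) * a.var(ddof=1) + (n_b - 1) * b.var(ddof=1)
    ) / (n_a + n_b - 2)
    d = abs(a.mean() - b.mean()) / np.sqrt(pooled_var)
    return t_stat, p_val, d


# Toy data (not from the dataset).
t_stat, p_val, d = two_group_summary([1.0, 2.0, 3.0], [2.0, 3.0, 4.0])
print(f"t-statistic value: {t_stat:.2f}, Cohen's d: {d:.2f}")
```

With samples this large, the p-value is close to zero for almost any difference, so the effect size is the more informative number when deciding which variables merit further exploration.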


------------ first_buy_total by weekend ------------

count    597414.000000
mean        461.218400
std         365.240937
min           9.000000
25%         217.000000
50%         361.000000
75%         602.000000
max       20103.000000
Name: first_buy_total, dtype: float64

 Paired t-test
p_val to reject null: 0.0
t-statistic value: 135.64


Effect size
Cohen's d: 0.51


------------ first_buy_total by filter_no_lcc ------------

count    597414.000000
mean        461.218400
std         365.240937
min           9.000000
25%         217.000000
50%         361.000000
75%         602.000000
max       20103.000000
Name: first_buy_total, dtype: float64

 Paired t-test
p_val to reject null: 0.905
t-statistic value: -0.12


Effect size
Cohen's d: 0.0


------------ first_buy_total by filter_non_stop ------------

count    597414.000000
mean        461.218400
std         365.240937
min           9.000000
25%         217.000000
50%         361.000000
75%         602.000000
max       20103.000000
Name: first_buy_total, dtype: float64

 Paired t-test
p_val to reject null: 0.0
t-statistic value: -32.36


Effect size
Cohen's d: 0.14


------------ first_buy_total by filter_short_layover ------------

count    597414.000000
mean        461.218400
std         365.240937
min           9.000000
25%         217.000000
50%         361.000000
75%         602.000000
max       20103.000000
Name: first_buy_total, dtype: float64

 Paired t-test
p_val to reject null: 0.0
t-statistic value: -68.26


Effect size
Cohen's d: 0.46


------------ first_buy_total by filter_name ------------

count    597414.000000
mean        461.218400
std         365.240937
min           9.000000
25%         217.000000
50%         361.000000
75%         602.000000
max       20103.000000
Name: first_buy_total, dtype: float64

 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 7039.74


Effect size
Eta^2: 0.01


------------ first_buy_total by first_rec ------------

count    597414.000000
mean        461.218400
std         365.240937
min           9.000000
25%         217.000000
50%         361.000000
75%         602.000000
max       20103.000000
Name: first_buy_total, dtype: float64

 Anova - Kruskal-Wallis
p_val to reject null: nan
H value: nan


Effect size
Eta^2: nan


------------ first_buy_total by last_rec ------------

count    597414.000000
mean        461.218400
std         365.240937
min           9.000000
25%         217.000000
50%         361.000000
75%         602.000000
max       20103.000000
Name: first_buy_total, dtype: float64

 Anova - Kruskal-Wallis
p_val to reject null: nan
H value: nan


Effect size
Eta^2: nan


------------ first_buy_total by is_session_1 ------------

count    597414.000000
mean        461.218400
std         365.240937
min           9.000000
25%         217.000000
50%         361.000000
75%         602.000000
max       20103.000000
Name: first_buy_total, dtype: float64

 Paired t-test
p_val to reject null: 0.0
t-statistic value: 21.97


Effect size
Cohen's d: 0.06


------------ first_buy_total by Search or watch ------------

count    597414.000000
mean        461.218400
std         365.240937
min           9.000000
25%         217.000000
50%         361.000000
75%         602.000000
max       20103.000000
Name: first_buy_total, dtype: float64

 Paired t-test
p_val to reject null: 0.0
t-statistic value: -9.73


Effect size
Cohen's d: 0.03


------------ first_buy_total by Use frequency ------------

count    597414.000000
mean        461.218400
std         365.240937
min           9.000000
25%         217.000000
50%         361.000000
75%         602.000000
max       20103.000000
Name: first_buy_total, dtype: float64

 Paired t-test
p_val to reject null: 0.0
t-statistic value: 10.93


Effect size
Cohen's d: 0.05


------------ first_buy_total by continent_origin ------------

count    597414.000000
mean        461.218400
std         365.240937
min           9.000000
25%         217.000000
50%         361.000000
75%         602.000000
max       20103.000000
Name: first_buy_total, dtype: float64

 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 17713.82


Effect size
Eta^2: 0.03


------------ first_buy_total by City origin ------------

count    597414.000000
mean        461.218400
std         365.240937
min           9.000000
25%         217.000000
50%         361.000000
75%         602.000000
max       20103.000000
Name: first_buy_total, dtype: float64
1237 different categorical values, too many to plot.



 Anova - Kruskal-Wallis
p_val to reject null: nan
H value: nan


Effect size
Eta^2: nan


------------ first_buy_total by City destination ------------

count    597414.000000
mean        461.218400
std         365.240937
min           9.000000
25%         217.000000
50%         361.000000
75%         602.000000
max       20103.000000
Name: first_buy_total, dtype: float64
1487 different categorical values, too many to plot.



 Anova - Kruskal-Wallis
p_val to reject null: nan
H value: nan


Effect size
Eta^2: nan


------------ first_buy_total by region_origin ------------

count    597414.000000
mean        461.218400
std         365.240937
min           9.000000
25%         217.000000
50%         361.000000
75%         602.000000
max       20103.000000
Name: first_buy_total, dtype: float64
453 different categorical values, too many to plot.



 Anova - Kruskal-Wallis
p_val to reject null: nan
H value: nan


Effect size
Eta^2: nan


------------ first_buy_total by region_destination ------------

count    597414.000000
mean        461.218400
std         365.240937
min           9.000000
25%         217.000000
50%         361.000000
75%         602.000000
max       20103.000000
Name: first_buy_total, dtype: float64
498 different categorical values, too many to plot.



 Anova - Kruskal-Wallis
p_val to reject null: nan
H value: nan


Effect size
Eta^2: nan


------------ first_buy_total by country_origin ------------

count    597414.000000
mean        461.218400
std         365.240937
min           9.000000
25%         217.000000
50%         361.000000
75%         602.000000
max       20103.000000
Name: first_buy_total, dtype: float64
188 different categorical values, too many to plot.



 Anova - Kruskal-Wallis
p_val to reject null: nan
H value: nan


Effect size
Eta^2: nan


------------ first_buy_total by continent_destination ------------

count    597414.000000
mean        461.218400
std         365.240937
min           9.000000
25%         217.000000
50%         361.000000
75%         602.000000
max       20103.000000
Name: first_buy_total, dtype: float64

 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 92026.55


Effect size
Eta^2: 0.15
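
The Eta^2 values in these blocks are consistent with the usual conversion of the Kruskal-Wallis H statistic into an effect size, eta^2 = (H - k + 1) / (n - k), where n is the number of observations and k the number of groups. The sketch below checks that against the block above. The function name `eta_squared_from_h` is hypothetical, and k = 6 is an assumed group count for `continent_destination`, since the output does not list it; with n this large the result is insensitive to the exact k.

```python
def eta_squared_from_h(h_value, n_obs, n_groups):
    """Eta squared derived from a Kruskal-Wallis H statistic.

    eta^2 = (H - k + 1) / (n - k), with n observations in k groups.
    """
    if n_obs <= n_groups:
        raise ValueError("need more observations than groups")
    return (h_value - n_groups + 1) / (n_obs - n_groups)


# H and count taken from the block above; n_groups=6 is an assumption.
eta_sq = eta_squared_from_h(92026.55, 597414, 6)
print(f"Eta^2: {eta_sq:.2f}")
```

Read this way, an Eta^2 of 0.15 means destination continent accounts for roughly 15% of the rank variance in `first_buy_total`, far more than variables reporting 0.01 or 0.02.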


------------ first_buy_total by country_destination ------------

count    597414.000000
mean        461.218400
std         365.240937
min           9.000000
25%         217.000000
50%         361.000000
75%         602.000000
max       20103.000000
Name: first_buy_total, dtype: float64
209 different categorical values, too many to plot.



 Anova - Kruskal-Wallis
p_val to reject null: nan
H value: nan


Effect size
Eta^2: nan


------------ first_buy_total by Domestic or international ------------

count    597414.000000
mean        461.218400
std         365.240937
min           9.000000
25%         217.000000
50%         361.000000
75%         602.000000
max       20103.000000
Name: first_buy_total, dtype: float64

 Paired t-test
p_val to reject null: 0.0
t-statistic value: 381.21


Effect size
Cohen's d: 0.98


------------ first_buy_total by Region ------------

count    597414.000000
mean        461.218400
std         365.240937
min           9.000000
25%         217.000000
50%         361.000000
75%         602.000000
max       20103.000000
Name: first_buy_total, dtype: float64

 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 12996.82


Effect size
Eta^2: 0.02


------------ first_buy_total by IncomeGroup ------------

count    597414.000000
mean        461.218400
std         365.240937
min           9.000000
25%         217.000000
50%         361.000000
75%         602.000000
max       20103.000000
Name: first_buy_total, dtype: float64

 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 7545.73


Effect size
Eta^2: 0.01


------------ first_buy_total by outcome ------------

count    597414.000000
mean        461.218400
std         365.240937
min           9.000000
25%         217.000000
50%         361.000000
75%         602.000000
max       20103.000000
Name: first_buy_total, dtype: float64

 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 1697.59


Effect size
Eta^2: 0.0


------------ lowest_total by origin_city ------------

count    949529.000000
mean        472.297550
std         384.511628
min           9.000000
25%         222.000000
50%         362.000000
75%         612.000000
max       21539.000000
Name: lowest_total, dtype: float64
1324 different categorical values, too many to plot.



 Anova - Kruskal-Wallis
p_val to reject null: nan
H value: nan


Effect size
Eta^2: nan


------------ lowest_total by destination_city ------------

count    949529.000000
mean        472.297550
std         384.511628
min           9.000000
25%         222.000000
50%         362.000000
75%         612.000000
max       21539.000000
Name: lowest_total, dtype: float64
1583 different categorical values, too many to plot.



 Anova - Kruskal-Wallis
p_val to reject null: nan
H value: nan


Effect size
Eta^2: nan


------------ lowest_total by trip_type ------------

count    949529.000000
mean        472.297550
std         384.511628
min           9.000000
25%         222.000000
50%         362.000000
75%         612.000000
max       21539.000000
Name: lowest_total, dtype: float64

 Paired t-test
p_val to reject null: 0.0
t-statistic value: 197.48


Effect size
Cohen's d: 0.58


------------ lowest_total by weekend ------------

count    949529.000000
mean        472.297550
std         384.511628
min           9.000000
25%         222.000000
50%         362.000000
75%         612.000000
max       21539.000000
Name: lowest_total, dtype: float64

 Paired t-test
p_val to reject null: 0.0
t-statistic value: 180.91


Effect size
Cohen's d: 0.54


------------ lowest_total by filter_no_lcc ------------

count    949529.000000
mean        472.297550
std         384.511628
min           9.000000
25%         222.000000
50%         362.000000
75%         612.000000
max       21539.000000
Name: lowest_total, dtype: float64

 Paired t-test
p_val to reject null: 0.0
t-statistic value: 10.95


Effect size
Cohen's d: 0.1


------------ lowest_total by filter_non_stop ------------

count    949529.000000
mean        472.297550
std         384.511628
min           9.000000
25%         222.000000
50%         362.000000
75%         612.000000
max       21539.000000
Name: lowest_total, dtype: float64

 Paired t-test
p_val to reject null: 0.0
t-statistic value: -50.05


Effect size
Cohen's d: 0.17


------------ lowest_total by filter_short_layover ------------

count    949529.000000
mean        472.297550
std         384.511628
min           9.000000
25%         222.000000
50%         362.000000
75%         612.000000
max       21539.000000
Name: lowest_total, dtype: float64

 Paired t-test
p_val to reject null: 0.0
t-statistic value: -85.44


Effect size
Cohen's d: 0.44


------------ lowest_total by filter_name ------------

count    949529.000000
mean        472.297550
std         384.511628
min           9.000000
25%         222.000000
50%         362.000000
75%         612.000000
max       21539.000000
Name: lowest_total, dtype: float64

 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 12749.45


Effect size
Eta^2: 0.01


------------ lowest_total by first_rec ------------

count    949529.000000
mean        472.297550
std         384.511628
min           9.000000
25%         222.000000
50%         362.000000
75%         612.000000
max       21539.000000
Name: lowest_total, dtype: float64

 Anova - Kruskal-Wallis
p_val to reject null: nan
H value: nan


Effect size
Eta^2: nan


------------ lowest_total by last_rec ------------

count    949529.000000
mean        472.297550
std         384.511628
min           9.000000
25%         222.000000
50%         362.000000
75%         612.000000
max       21539.000000
Name: lowest_total, dtype: float64

 Anova - Kruskal-Wallis
p_val to reject null: nan
H value: nan


Effect size
Eta^2: nan


------------ lowest_total by is_session_1 ------------

count    949529.000000
mean        472.297550
std         384.511628
min           9.000000
25%         222.000000
50%         362.000000
75%         612.000000
max       21539.000000
Name: lowest_total, dtype: float64

 Paired t-test
p_val to reject null: 0.0
t-statistic value: 21.12


Effect size
Cohen's d: 0.04


------------ lowest_total by Search or watch ------------

count    949529.000000
mean        472.297550
std         384.511628
min           9.000000
25%         222.000000
50%         362.000000
75%         612.000000
max       21539.000000
Name: lowest_total, dtype: float64

 Paired t-test
p_val to reject null: 0.0
t-statistic value: -57.16


Effect size
Cohen's d: 0.12


------------ lowest_total by Use frequency ------------

count    949529.000000
mean        472.297550
std         384.511628
min           9.000000
25%         222.000000
50%         362.000000
75%         612.000000
max       21539.000000
Name: lowest_total, dtype: float64

 Paired t-test
p_val to reject null: 0.0
t-statistic value: 10.56


Effect size
Cohen's d: 0.04


------------ lowest_total by continent_origin ------------

count    949529.000000
mean        472.297550
std         384.511628
min           9.000000
25%         222.000000
50%         362.000000
75%         612.000000
max       21539.000000
Name: lowest_total, dtype: float64

 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 22158.96


Effect size
Eta^2: 0.02


------------ lowest_total by City origin ------------

count    949529.000000
mean        472.297550
std         384.511628
min           9.000000
25%         222.000000
50%         362.000000
75%         612.000000
max       21539.000000
Name: lowest_total, dtype: float64
1237 different categorical values, too many to plot.



 Anova - Kruskal-Wallis
p_val to reject null: nan
H value: nan


Effect size
Eta^2: nan


------------ lowest_total by City destination ------------

count    949529.000000
mean        472.297550
std         384.511628
min           9.000000
25%         222.000000
50%         362.000000
75%         612.000000
max       21539.000000
Name: lowest_total, dtype: float64
1487 different categorical values, too many to plot.



 Anova - Kruskal-Wallis
p_val to reject null: nan
H value: nan


Effect size
Eta^2: nan


------------ lowest_total by region_origin ------------

count    949529.000000
mean        472.297550
std         384.511628
min           9.000000
25%         222.000000
50%         362.000000
75%         612.000000
max       21539.000000
Name: lowest_total, dtype: float64
453 different categorical values, too many to plot.



 Anova - Kruskal-Wallis
p_val to reject null: nan
H value: nan


Effect size
Eta^2: nan


------------ lowest_total by region_destination ------------

count    949529.000000
mean        472.297550
std         384.511628
min           9.000000
25%         222.000000
50%         362.000000
75%         612.000000
max       21539.000000
Name: lowest_total, dtype: float64
498 different categorical values, too many to plot.



 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 272364.97


Effect size
Eta^2: 0.29


------------ lowest_total by country_origin ------------

count    949529.000000
mean        472.297550
std         384.511628
min           9.000000
25%         222.000000
50%         362.000000
75%         612.000000
max       21539.000000
Name: lowest_total, dtype: float64
188 different categorical values, too many to plot.



 Anova - Kruskal-Wallis
p_val to reject null: nan
H value: nan


Effect size
Eta^2: nan


------------ lowest_total by continent_destination ------------

count    949529.000000
mean        472.297550
std         384.511628
min           9.000000
25%         222.000000
50%         362.000000
75%         612.000000
max       21539.000000
Name: lowest_total, dtype: float64

 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 176933.01


Effect size
Eta^2: 0.19


------------ lowest_total by country_destination ------------

count    949529.000000
mean        472.297550
std         384.511628
min           9.000000
25%         222.000000
50%         362.000000
75%         612.000000
max       21539.000000
Name: lowest_total, dtype: float64
209 different categorical values, too many to plot.



 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 238448.8


Effect size
Eta^2: 0.25


------------ lowest_total by Domestic or international ------------

count    949529.000000
mean        472.297550
std         384.511628
min           9.000000
25%         222.000000
50%         362.000000
75%         612.000000
max       21539.000000
Name: lowest_total, dtype: float64

 Paired t-test
p_val to reject null: 0.0
t-statistic value: 505.76


Effect size
Cohen's d: 1


------------ lowest_total by Region ------------

count    949529.000000
mean        472.297550
std         384.511628
min           9.000000
25%         222.000000
50%         362.000000
75%         612.000000
max       21539.000000
Name: lowest_total, dtype: float64

 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 16657.08


Effect size
Eta^2: 0.02


------------ lowest_total by IncomeGroup ------------

count    949529.000000
mean        472.297550
std         384.511628
min           9.000000
25%         222.000000
50%         362.000000
75%         612.000000
max       21539.000000
Name: lowest_total, dtype: float64

 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 12825.1


Effect size
Eta^2: 0.01


------------ lowest_total by outcome ------------

count    949529.000000
mean        472.297550
std         384.511628
min           9.000000
25%         222.000000
50%         362.000000
75%         612.000000
max       21539.000000
Name: lowest_total, dtype: float64

 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 4393.07


Effect size
Eta^2: 0.0


------------ session by origin_city ------------

count    1.007692e+06
mean     2.616324e+00
std      2.560719e+00
min      1.000000e+00
25%      1.000000e+00
50%      2.000000e+00
75%      3.000000e+00
max      3.000000e+01
Name: session, dtype: float64
1324 different categorical values, too many to plot.



 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 16421.34


Effect size
Eta^2: 0.02


------------ session by destination_city ------------

count    1.007692e+06
mean     2.616324e+00
std      2.560719e+00
min      1.000000e+00
25%      1.000000e+00
50%      2.000000e+00
75%      3.000000e+00
max      3.000000e+01
Name: session, dtype: float64
1583 different categorical values, too many to plot.



 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 8771.51


Effect size
Eta^2: 0.01


------------ session by trip_type ------------

count    1.007692e+06
mean     2.616324e+00
std      2.560719e+00
min      1.000000e+00
25%      1.000000e+00
50%      2.000000e+00
75%      3.000000e+00
max      3.000000e+01
Name: session, dtype: float64

 Paired t-test
p_val to reject null: 0.0
t-statistic value: -62.87


Effect size
Cohen's d: 0.16


------------ session by weekend ------------

count    1.007692e+06
mean     2.616324e+00
std      2.560719e+00
min      1.000000e+00
25%      1.000000e+00
50%      2.000000e+00
75%      3.000000e+00
max      3.000000e+01
Name: session, dtype: float64

 Paired t-test
p_val to reject null: 0.0
t-statistic value: 6.83


Effect size
Cohen's d: 0.02


------------ session by filter_no_lcc ------------

count    1.007692e+06
mean     2.616324e+00
std      2.560719e+00
min      1.000000e+00
25%      1.000000e+00
50%      2.000000e+00
75%      3.000000e+00
max      3.000000e+01
Name: session, dtype: float64

 Paired t-test
p_val to reject null: 0.0004
t-statistic value: 3.54


Effect size
Cohen's d: 0.03


------------ session by filter_non_stop ------------

count    1.007692e+06
mean     2.616324e+00
std      2.560719e+00
min      1.000000e+00
25%      1.000000e+00
50%      2.000000e+00
75%      3.000000e+00
max      3.000000e+01
Name: session, dtype: float64

 Paired t-test
p_val to reject null: 0.0
t-statistic value: -10.5


Effect size
Cohen's d: 0.03


------------ session by filter_short_layover ------------

count    1.007692e+06
mean     2.616324e+00
std      2.560719e+00
min      1.000000e+00
25%      1.000000e+00
50%      2.000000e+00
75%      3.000000e+00
max      3.000000e+01
Name: session, dtype: float64

 Paired t-test
p_val to reject null: 0.0
t-statistic value: 13.59


Effect size
Cohen's d: 0.08


------------ session by filter_name ------------

count    1.007692e+06
mean     2.616324e+00
std      2.560719e+00
min      1.000000e+00
25%      1.000000e+00
50%      2.000000e+00
75%      3.000000e+00
max      3.000000e+01
Name: session, dtype: float64

 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 442.38


Effect size
Eta^2: 0.0


------------ session by first_rec ------------

count    1.007692e+06
mean     2.616324e+00
std      2.560719e+00
min      1.000000e+00
25%      1.000000e+00
50%      2.000000e+00
75%      3.000000e+00
max      3.000000e+01
Name: session, dtype: float64

 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 3692.19


Effect size
Eta^2: 0.0


------------ session by last_rec ------------

count    1.007692e+06
mean     2.616324e+00
std      2.560719e+00
min      1.000000e+00
25%      1.000000e+00
50%      2.000000e+00
75%      3.000000e+00
max      3.000000e+01
Name: session, dtype: float64

 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 3686.38


Effect size
Eta^2: 0.0


------------ session by is_session_1 ------------

count    1.007692e+06
mean     2.616324e+00
std      2.560719e+00
min      1.000000e+00
25%      1.000000e+00
50%      2.000000e+00
75%      3.000000e+00
max      3.000000e+01
Name: session, dtype: float64

 Paired t-test
p_val to reject null: 0.0
t-statistic value: -734.88


Effect size
Cohen's d: 1


------------ session by Search or watch ------------

count    1.007692e+06
mean     2.616324e+00
std      2.560719e+00
min      1.000000e+00
25%      1.000000e+00
50%      2.000000e+00
75%      3.000000e+00
max      3.000000e+01
Name: session, dtype: float64

 Paired t-test
p_val to reject null: 0.0
t-statistic value: -127.0


Effect size
Cohen's d: 0.28


------------ session by Use frequency ------------

count    1.007692e+06
mean     2.616324e+00
std      2.560719e+00
min      1.000000e+00
25%      1.000000e+00
50%      2.000000e+00
75%      3.000000e+00
max      3.000000e+01
Name: session, dtype: float64

 Paired t-test
p_val to reject null: 0.0
t-statistic value: -179.68


Effect size
Cohen's d: 0.94


------------ session by continent_origin ------------

count    1.007692e+06
mean     2.616324e+00
std      2.560719e+00
min      1.000000e+00
25%      1.000000e+00
50%      2.000000e+00
75%      3.000000e+00
max      3.000000e+01
Name: session, dtype: float64

 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 2797.35


Effect size
Eta^2: 0.0


------------ session by City origin ------------

count    1.007692e+06
mean     2.616324e+00
std      2.560719e+00
min      1.000000e+00
25%      1.000000e+00
50%      2.000000e+00
75%      3.000000e+00
max      3.000000e+01
Name: session, dtype: float64
1237 different categorical values, too many to plot.



 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 15651.9


Effect size
Eta^2: 0.01


------------ session by City destination ------------

count    1.007692e+06
mean     2.616324e+00
std      2.560719e+00
min      1.000000e+00
25%      1.000000e+00
50%      2.000000e+00
75%      3.000000e+00
max      3.000000e+01
Name: session, dtype: float64
1487 different categorical values, too many to plot.



 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 8025.67


Effect size
Eta^2: 0.01


------------ session by region_origin ------------

count    1.007692e+06
mean     2.616324e+00
std      2.560719e+00
min      1.000000e+00
25%      1.000000e+00
50%      2.000000e+00
75%      3.000000e+00
max      3.000000e+01
Name: session, dtype: float64
453 different categorical values, too many to plot.



 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 11715.2


Effect size
Eta^2: 0.01


------------ session by region_destination ------------

count    1.007692e+06
mean     2.616324e+00
std      2.560719e+00
min      1.000000e+00
25%      1.000000e+00
50%      2.000000e+00
75%      3.000000e+00
max      3.000000e+01
Name: session, dtype: float64
498 different categorical values, too many to plot.



 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 4828.28


Effect size
Eta^2: 0.0


------------ session by country_origin ------------

count    1.007692e+06
mean     2.616324e+00
std      2.560719e+00
min      1.000000e+00
25%      1.000000e+00
50%      2.000000e+00
75%      3.000000e+00
max      3.000000e+01
Name: session, dtype: float64
188 different categorical values, too many to plot.



 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 7360.07


Effect size
Eta^2: 0.01


------------ session by continent_destination ------------

count    1.007692e+06
mean     2.616324e+00
std      2.560719e+00
min      1.000000e+00
25%      1.000000e+00
50%      2.000000e+00
75%      3.000000e+00
max      3.000000e+01
Name: session, dtype: float64

 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 425.24


Effect size
Eta^2: 0.0


------------ session by country_destination ------------

count    1.007692e+06
mean     2.616324e+00
std      2.560719e+00
min      1.000000e+00
25%      1.000000e+00
50%      2.000000e+00
75%      3.000000e+00
max      3.000000e+01
Name: session, dtype: float64
209 different categorical values, too many to plot.



 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 2742.74


Effect size
Eta^2: 0.0


------------ session by Domestic or international ------------

count    1.007692e+06
mean     2.616324e+00
std      2.560719e+00
min      1.000000e+00
25%      1.000000e+00
50%      2.000000e+00
75%      3.000000e+00
max      3.000000e+01
Name: session, dtype: float64

 Paired t-test
p_val to reject null: 0.0
t-statistic value: 5.12


Effect size
Cohen's d: 0.01


------------ session by Region ------------

count    1.007692e+06
mean     2.616324e+00
std      2.560719e+00
min      1.000000e+00
25%      1.000000e+00
50%      2.000000e+00
75%      3.000000e+00
max      3.000000e+01
Name: session, dtype: float64

 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 2071.13


Effect size
Eta^2: 0.0


------------ session by IncomeGroup ------------

count    1.007692e+06
mean     2.616324e+00
std      2.560719e+00
min      1.000000e+00
25%      1.000000e+00
50%      2.000000e+00
75%      3.000000e+00
max      3.000000e+01
Name: session, dtype: float64

 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 556.74


Effect size
Eta^2: 0.0


------------ session by outcome ------------

count    1.007692e+06
mean     2.616324e+00
std      2.560719e+00
min      1.000000e+00
25%      1.000000e+00
50%      2.000000e+00
75%      3.000000e+00
max      3.000000e+01
Name: session, dtype: float64

 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 24281.92


Effect size
Eta^2: 0.02


------------ first_buy - lowest_total by origin_city ------------

count    597414.000000
mean         10.216287
std          60.636663
min           0.000000
25%           0.000000
50%           0.000000
75%           0.000000
max       18166.000000
Name: first_buy - lowest_total, dtype: float64
1324 different categorical values, too many to plot.



 Anova - Kruskal-Wallis
p_val to reject null: nan
H value: nan


Effect size
Eta^2: nan


------------ first_buy - lowest_total by destination_city ------------

count    597414.000000
mean         10.216287
std          60.636663
min           0.000000
25%           0.000000
50%           0.000000
75%           0.000000
max       18166.000000
Name: first_buy - lowest_total, dtype: float64
1583 different categorical values, too many to plot.



 Anova - Kruskal-Wallis
p_val to reject null: nan
H value: nan


Effect size
Eta^2: nan


------------ first_buy - lowest_total by trip_type ------------

count    597414.000000
mean         10.216287
std          60.636663
min           0.000000
25%           0.000000
50%           0.000000
75%           0.000000
max       18166.000000
Name: first_buy - lowest_total, dtype: float64

 Paired t-test
p_val to reject null: 0.0
t-statistic value: 20.63


Effect size
Cohen's d: 0.08


------------ first_buy - lowest_total by weekend ------------

count    597414.000000
mean         10.216287
std          60.636663
min           0.000000
25%           0.000000
50%           0.000000
75%           0.000000
max       18166.000000
Name: first_buy - lowest_total, dtype: float64

 Paired t-test
p_val to reject null: 0.0001
t-statistic value: 3.93


Effect size
Cohen's d: 0.01


------------ first_buy - lowest_total by filter_no_lcc ------------

count    597414.000000
mean         10.216287
std          60.636663
min           0.000000
25%           0.000000
50%           0.000000
75%           0.000000
max       18166.000000
Name: first_buy - lowest_total, dtype: float64

 Paired t-test
p_val to reject null: 0.0
t-statistic value: -6.78


Effect size
Cohen's d: 0.08


------------ first_buy - lowest_total by filter_non_stop ------------

count    597414.000000
mean         10.216287
std          60.636663
min           0.000000
25%           0.000000
50%           0.000000
75%           0.000000
max       18166.000000
Name: first_buy - lowest_total, dtype: float64

 Paired t-test
p_val to reject null: 0.0
t-statistic value: 20.64


Effect size
Cohen's d: 0.09


------------ first_buy - lowest_total by filter_short_layover ------------

count    597414.000000
mean         10.216287
std          60.636663
min           0.000000
25%           0.000000
50%           0.000000
75%           0.000000
max       18166.000000
Name: first_buy - lowest_total, dtype: float64

 Paired t-test
p_val to reject null: 0.0
t-statistic value: -19.99


Effect size
Cohen's d: 0.14


------------ first_buy - lowest_total by filter_name ------------

count    597414.000000
mean         10.216287
std          60.636663
min           0.000000
25%           0.000000
50%           0.000000
75%           0.000000
max       18166.000000
Name: first_buy - lowest_total, dtype: float64

 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 3462.32


Effect size
Eta^2: 0.01


------------ first_buy - lowest_total by first_rec ------------

count    597414.000000
mean         10.216287
std          60.636663
min           0.000000
25%           0.000000
50%           0.000000
75%           0.000000
max       18166.000000
Name: first_buy - lowest_total, dtype: float64

 Anova - Kruskal-Wallis
p_val to reject null: nan
H value: nan


Effect size
Eta^2: nan


------------ first_buy - lowest_total by last_rec ------------

count    597414.000000
mean         10.216287
std          60.636663
min           0.000000
25%           0.000000
50%           0.000000
75%           0.000000
max       18166.000000
Name: first_buy - lowest_total, dtype: float64

 Anova - Kruskal-Wallis
p_val to reject null: nan
H value: nan


Effect size
Eta^2: nan


------------ first_buy - lowest_total by is_session_1 ------------

count    597414.000000
mean         10.216287
std          60.636663
min           0.000000
25%           0.000000
50%           0.000000
75%           0.000000
max       18166.000000
Name: first_buy - lowest_total, dtype: float64

 Paired t-test
p_val to reject null: 0.0
t-statistic value: 34.96


Effect size
Cohen's d: 0.09


------------ first_buy - lowest_total by Search or watch ------------

count    597414.000000
mean         10.216287
std          60.636663
min           0.000000
25%           0.000000
50%           0.000000
75%           0.000000
max       18166.000000
Name: first_buy - lowest_total, dtype: float64

 Paired t-test
p_val to reject null: 0.0
t-statistic value: 152.41


Effect size
Cohen's d: 0.36


------------ first_buy - lowest_total by Use frequency ------------

count    597414.000000
mean         10.216287
std          60.636663
min           0.000000
25%           0.000000
50%           0.000000
75%           0.000000
max       18166.000000
Name: first_buy - lowest_total, dtype: float64

 Paired t-test
p_val to reject null: 0.0
t-statistic value: 16.69


Effect size
Cohen's d: 0.08


------------ first_buy - lowest_total by continent_origin ------------

count    597414.000000
mean         10.216287
std          60.636663
min           0.000000
25%           0.000000
50%           0.000000
75%           0.000000
max       18166.000000
Name: first_buy - lowest_total, dtype: float64

 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 404.11


Effect size
Eta^2: 0.0


------------ first_buy - lowest_total by City origin ------------

count    597414.000000
mean         10.216287
std          60.636663
min           0.000000
25%           0.000000
50%           0.000000
75%           0.000000
max       18166.000000
Name: first_buy - lowest_total, dtype: float64
1237 different categorical values, too many to plot.



 Anova - Kruskal-Wallis
p_val to reject null: nan
H value: nan


Effect size
Eta^2: nan


------------ first_buy - lowest_total by City destination ------------

count    597414.000000
mean         10.216287
std          60.636663
min           0.000000
25%           0.000000
50%           0.000000
75%           0.000000
max       18166.000000
Name: first_buy - lowest_total, dtype: float64
1487 different categorical values, too many to plot.



 Anova - Kruskal-Wallis
p_val to reject null: nan
H value: nan


Effect size
Eta^2: nan
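

The "too many to plot" message follows from the rule stated at the top of the notebook: a plot is drawn only when the number of categories is small enough (less than 20). A minimal sketch of such a guard is shown below. The function name `plot_guard` and the parameter `max_categories` are hypothetical, and the actual plotting call is left out.

```python
def plot_guard(categories, max_categories=20):
    """Return a skip message when there are too many categories, else None."""
    n_unique = len(set(categories))
    if n_unique < max_categories:
        return None  # few enough categories: the caller can go ahead and plot
    return f"{n_unique} different categorical values, too many to plot."


# Hypothetical inputs: a high-cardinality column and a binary one
many_cities = [f"city_{i}" for i in range(1487)]
binary_flag = ["yes", "no", "yes", "yes"]
print(plot_guard(many_cities))
print(plot_guard(binary_flag))
```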


------------ first_buy - lowest_total by region_origin ------------

count    597414.000000
mean         10.216287
std          60.636663
min           0.000000
25%           0.000000
50%           0.000000
75%           0.000000
max       18166.000000
Name: first_buy - lowest_total, dtype: float64
453 different categorical values, too many to plot.



 Anova - Kruskal-Wallis
p_val to reject null: nan
H value: nan


Effect size
Eta^2: nan


------------ first_buy - lowest_total by region_destination ------------

count    597414.000000
mean         10.216287
std          60.636663
min           0.000000
25%           0.000000
50%           0.000000
75%           0.000000
max       18166.000000
Name: first_buy - lowest_total, dtype: float64
498 different categorical values, too many to plot.



 Anova - Kruskal-Wallis
p_val to reject null: nan
H value: nan


Effect size
Eta^2: nan


------------ first_buy - lowest_total by country_origin ------------

count    597414.000000
mean         10.216287
std          60.636663
min           0.000000
25%           0.000000
50%           0.000000
75%           0.000000
max       18166.000000
Name: first_buy - lowest_total, dtype: float64
188 different categorical values, too many to plot.



 Anova - Kruskal-Wallis
p_val to reject null: nan
H value: nan


Effect size
Eta^2: nan


------------ first_buy - lowest_total by continent_destination ------------

count    597414.000000
mean         10.216287
std          60.636663
min           0.000000
25%           0.000000
50%           0.000000
75%           0.000000
max       18166.000000
Name: first_buy - lowest_total, dtype: float64

 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 167.34


Effect size
Eta^2: 0.0


------------ first_buy - lowest_total by country_destination ------------

count    597414.000000
mean         10.216287
std          60.636663
min           0.000000
25%           0.000000
50%           0.000000
75%           0.000000
max       18166.000000
Name: first_buy - lowest_total, dtype: float64
209 different categorical values, too many to plot.



 Anova - Kruskal-Wallis
p_val to reject null: nan
H value: nan


Effect size
Eta^2: nan


------------ first_buy - lowest_total by Domestic or international ------------

count    597414.000000
mean         10.216287
std          60.636663
min           0.000000
25%           0.000000
50%           0.000000
75%           0.000000
max       18166.000000
Name: first_buy - lowest_total, dtype: float64

 Paired t-test
p_val to reject null: 0.0
t-statistic value: 30.3


Effect size
Cohen's d: 0.08


------------ first_buy - lowest_total by Region ------------

count    597414.000000
mean         10.216287
std          60.636663
min           0.000000
25%           0.000000
50%           0.000000
75%           0.000000
max       18166.000000
Name: first_buy - lowest_total, dtype: float64

 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 449.33


Effect size
Eta^2: 0.0


------------ first_buy - lowest_total by IncomeGroup ------------

count    597414.000000
mean         10.216287
std          60.636663
min           0.000000
25%           0.000000
50%           0.000000
75%           0.000000
max       18166.000000
Name: first_buy - lowest_total, dtype: float64

 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 53.17


Effect size
Eta^2: 0.0


------------ first_buy - lowest_total by outcome ------------

count    597414.000000
mean         10.216287
std          60.636663
min           0.000000
25%           0.000000
50%           0.000000
75%           0.000000
max       18166.000000
Name: first_buy - lowest_total, dtype: float64

 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 152072.97


Effect size
Eta^2: 0.25


------------ Adult population by origin_city ------------

count    1.007113e+06
mean     1.935103e+08
std      1.088621e+08
min      5.559800e+04
25%      1.129580e+08
50%      2.446359e+08
75%      2.446359e+08
max      1.088715e+09
Name: Adult population, dtype: float64
1324 different categorical values, too many to plot.



 Anova - Kruskal-Wallis
p_val to reject null: nan
H value: nan


Effect size
Eta^2: nan


------------ Adult population by destination_city ------------

count    1.007113e+06
mean     1.935103e+08
std      1.088621e+08
min      5.559800e+04
25%      1.129580e+08
50%      2.446359e+08
75%      2.446359e+08
max      1.088715e+09
Name: Adult population, dtype: float64
1583 different categorical values, too many to plot.



 Anova - Kruskal-Wallis
p_val to reject null: nan
H value: nan


Effect size
Eta^2: nan


------------ Adult population by trip_type ------------

count    1.007113e+06
mean     1.935103e+08
std      1.088621e+08
min      5.559800e+04
25%      1.129580e+08
50%      2.446359e+08
75%      2.446359e+08
max      1.088715e+09
Name: Adult population, dtype: float64

 Paired t-test
p_val to reject null: 0.0
t-statistic value: 49.48


Effect size
Cohen's d: 0.12


------------ Adult population by weekend ------------

count    1.007113e+06
mean     1.935103e+08
std      1.088621e+08
min      5.559800e+04
25%      1.129580e+08
50%      2.446359e+08
75%      2.446359e+08
max      1.088715e+09
Name: Adult population, dtype: float64

 Paired t-test
p_val to reject null: 0.0
t-statistic value: -105.33


Effect size
Cohen's d: 0.29


------------ Adult population by filter_no_lcc ------------

count    1.007113e+06
mean     1.935103e+08
std      1.088621e+08
min      5.559800e+04
25%      1.129580e+08
50%      2.446359e+08
75%      2.446359e+08
max      1.088715e+09
Name: Adult population, dtype: float64

 Paired t-test
p_val to reject null: 0.0
t-statistic value: -17.12


Effect size
Cohen's d: 0.16


------------ Adult population by filter_non_stop ------------

count    1.007113e+06
mean     1.935103e+08
std      1.088621e+08
min      5.559800e+04
25%      1.129580e+08
50%      2.446359e+08
75%      2.446359e+08
max      1.088715e+09
Name: Adult population, dtype: float64

 Paired t-test
p_val to reject null: 0.0
t-statistic value: 25.77


Effect size
Cohen's d: 0.09


------------ Adult population by filter_short_layover ------------

count    1.007113e+06
mean     1.935103e+08
std      1.088621e+08
min      5.559800e+04
25%      1.129580e+08
50%      2.446359e+08
75%      2.446359e+08
max      1.088715e+09
Name: Adult population, dtype: float64

 Paired t-test
p_val to reject null: 0.8635
t-statistic value: 0.17


Effect size
Cohen's d: 0.0


------------ Adult population by filter_name ------------

count    1.007113e+06
mean     1.935103e+08
std      1.088621e+08
min      5.559800e+04
25%      1.129580e+08
50%      2.446359e+08
75%      2.446359e+08
max      1.088715e+09
Name: Adult population, dtype: float64

 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 1289.57


Effect size
Eta^2: 0.0


------------ Adult population by first_rec ------------

count    1.007113e+06
mean     1.935103e+08
std      1.088621e+08
min      5.559800e+04
25%      1.129580e+08
50%      2.446359e+08
75%      2.446359e+08
max      1.088715e+09
Name: Adult population, dtype: float64

 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 2993.03


Effect size
Eta^2: 0.0


------------ Adult population by last_rec ------------

count    1.007113e+06
mean     1.935103e+08
std      1.088621e+08
min      5.559800e+04
25%      1.129580e+08
50%      2.446359e+08
75%      2.446359e+08
max      1.088715e+09
Name: Adult population, dtype: float64

 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 1623.93


Effect size
Eta^2: 0.0


------------ Adult population by is_session_1 ------------

count    1.007113e+06
mean     1.935103e+08
std      1.088621e+08
min      5.559800e+04
25%      1.129580e+08
50%      2.446359e+08
75%      2.446359e+08
max      1.088715e+09
Name: Adult population, dtype: float64

 Paired t-test
p_val to reject null: 0.0
t-statistic value: -34.19


Effect size
Cohen's d: 0.07


------------ Adult population by Search or watch ------------

count    1.007113e+06
mean     1.935103e+08
std      1.088621e+08
min      5.559800e+04
25%      1.129580e+08
50%      2.446359e+08
75%      2.446359e+08
max      1.088715e+09
Name: Adult population, dtype: float64

 Paired t-test
p_val to reject null: 0.0055
t-statistic value: 2.77


Effect size
Cohen's d: 0.01


------------ Adult population by Use frequency ------------

count    1.007113e+06
mean     1.935103e+08
std      1.088621e+08
min      5.559800e+04
25%      1.129580e+08
50%      2.446359e+08
75%      2.446359e+08
max      1.088715e+09
Name: Adult population, dtype: float64

 Paired t-test
p_val to reject null: 0.0
t-statistic value: -29.61


Effect size
Cohen's d: 0.11


------------ Adult population by continent_origin ------------

count    1.007113e+06
mean     1.935103e+08
std      1.088621e+08
min      5.559800e+04
25%      1.129580e+08
50%      2.446359e+08
75%      2.446359e+08
max      1.088715e+09
Name: Adult population, dtype: float64

 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 398363.35


Effect size
Eta^2: 0.4


------------ Adult population by City origin ------------

count    1.007113e+06
mean     1.935103e+08
std      1.088621e+08
min      5.559800e+04
25%      1.129580e+08
50%      2.446359e+08
75%      2.446359e+08
max      1.088715e+09
Name: Adult population, dtype: float64
1237 different categorical values, too many to plot.



 Anova - Kruskal-Wallis
p_val to reject null: nan
H value: nan


Effect size
Eta^2: nan


------------ Adult population by City destination ------------

count    1.007113e+06
mean     1.935103e+08
std      1.088621e+08
min      5.559800e+04
25%      1.129580e+08
50%      2.446359e+08
75%      2.446359e+08
max      1.088715e+09
Name: Adult population, dtype: float64
1487 different categorical values, too many to plot.



 Anova - Kruskal-Wallis
p_val to reject null: nan
H value: nan


Effect size
Eta^2: nan


------------ Adult population by region_origin ------------

count    1.007113e+06
mean     1.935103e+08
std      1.088621e+08
min      5.559800e+04
25%      1.129580e+08
50%      2.446359e+08
75%      2.446359e+08
max      1.088715e+09
Name: Adult population, dtype: float64
453 different categorical values, too many to plot.



 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 946871.69


Effect size
Eta^2: 0.94


------------ Adult population by region_destination ------------

count    1.007113e+06
mean     1.935103e+08
std      1.088621e+08
min      5.559800e+04
25%      1.129580e+08
50%      2.446359e+08
75%      2.446359e+08
max      1.088715e+09
Name: Adult population, dtype: float64
498 different categorical values, too many to plot.



 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 115214.42


Effect size
Eta^2: 0.11


------------ Adult population by country_origin ------------

count    1.007113e+06
mean     1.935103e+08
std      1.088621e+08
min      5.559800e+04
25%      1.129580e+08
50%      2.446359e+08
75%      2.446359e+08
max      1.088715e+09
Name: Adult population, dtype: float64
188 different categorical values, too many to plot.



 Anova - Kruskal-Wallis
p_val to reject null: nan
H value: nan


Effect size
Eta^2: nan


------------ Adult population by continent_destination ------------

count    1.007113e+06
mean     1.935103e+08
std      1.088621e+08
min      5.559800e+04
25%      1.129580e+08
50%      2.446359e+08
75%      2.446359e+08
max      1.088715e+09
Name: Adult population, dtype: float64

 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 59581.96


Effect size
Eta^2: 0.06


------------ Adult population by country_destination ------------

count    1.007113e+06
mean     1.935103e+08
std      1.088621e+08
min      5.559800e+04
25%      1.129580e+08
50%      2.446359e+08
75%      2.446359e+08
max      1.088715e+09
Name: Adult population, dtype: float64
209 different categorical values, too many to plot.



 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 104364.18


Effect size
Eta^2: 0.1


------------ Adult population by Domestic or international ------------

count    1.007113e+06
mean     1.935103e+08
std      1.088621e+08
min      5.559800e+04
25%      1.129580e+08
50%      2.446359e+08
75%      2.446359e+08
max      1.088715e+09
Name: Adult population, dtype: float64

 Paired t-test
p_val to reject null: 0.0
t-statistic value: -434.63


Effect size
Cohen's d: 0.86


------------ Adult population by Region ------------

count    1.007113e+06
mean     1.935103e+08
std      1.088621e+08
min      5.559800e+04
25%      1.129580e+08
50%      2.446359e+08
75%      2.446359e+08
max      1.088715e+09
Name: Adult population, dtype: float64

 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 668332.15


Effect size
Eta^2: 0.66


------------ Adult population by IncomeGroup ------------

count    1.007113e+06
mean     1.935103e+08
std      1.088621e+08
min      5.559800e+04
25%      1.129580e+08
50%      2.446359e+08
75%      2.446359e+08
max      1.088715e+09
Name: Adult population, dtype: float64

 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 227648.52


Effect size
Eta^2: 0.23


------------ Adult population by outcome ------------

count    1.007113e+06
mean     1.935103e+08
std      1.088621e+08
min      5.559800e+04
25%      1.129580e+08
50%      2.446359e+08
75%      2.446359e+08
max      1.088715e+09
Name: Adult population, dtype: float64

 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 21.83


Effect size
Eta^2: 0.0


------------ Count_uniq_users_per_country by origin_city ------------

count    1.007692e+06
mean     5.534014e+05
std      3.181662e+05
min      1.000000e+00
25%      6.405800e+04
50%      7.431010e+05
75%      7.431010e+05
max      7.431010e+05
Name: Count_uniq_users_per_country, dtype: float64
1324 different categorical values, too many to plot.



 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 1007691.0


Effect size
Eta^2: 1.0
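

The H value above equals the count minus one (1007692 - 1 = 1007691), which is the largest value the tie-corrected statistic can take. A likely reading, offered as an interpretation rather than something the output states, is that `Count_uniq_users_per_country` is constant within each origin city because it is derived from the country, so the category fully determines the value. An effect size of 1.0 here therefore reflects how the variable was built, not a finding about user behaviour. The toy data below is hypothetical and only reproduces the mechanism.

```python
from scipy import stats

# Hypothetical toy data: the numeric value is a fixed number per category,
# mimicking a per-country count repeated on every row of that country.
groups = {
    "country_a": [3.0, 3.0, 3.0],
    "country_b": [7.0, 7.0],
    "country_c": [9.0, 9.0, 9.0, 9.0],
}
n = sum(len(values) for values in groups.values())
h_value, p_value = stats.kruskal(*groups.values())
eta_squared = h_value / (n - 1)
print(f"n - 1 = {n - 1}, H value: {h_value:.2f}, Eta^2: {eta_squared:.2f}")
```

Pairs like this one can be excluded from feature selection, since one variable is a deterministic function of the other.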


------------ Count_uniq_users_per_country by destination_city ------------

count    1.007692e+06
mean     5.534014e+05
std      3.181662e+05
min      1.000000e+00
25%      6.405800e+04
50%      7.431010e+05
75%      7.431010e+05
max      7.431010e+05
Name: Count_uniq_users_per_country, dtype: float64
1583 different categorical values, too many to plot.



 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 181732.34


Effect size
Eta^2: 0.18


------------ Count_uniq_users_per_country by trip_type ------------

count    1.007692e+06
mean     5.534014e+05
std      3.181662e+05
min      1.000000e+00
25%      6.405800e+04
50%      7.431010e+05
75%      7.431010e+05
max      7.431010e+05
Name: Count_uniq_users_per_country, dtype: float64

 Paired t-test
p_val to reject null: 0.0
t-statistic value: 83.95


Effect size
Cohen's d: 0.22


------------ Count_uniq_users_per_country by weekend ------------

count    1.007692e+06
mean     5.534014e+05
std      3.181662e+05
min      1.000000e+00
25%      6.405800e+04
50%      7.431010e+05
75%      7.431010e+05
max      7.431010e+05
Name: Count_uniq_users_per_country, dtype: float64

 Paired t-test
p_val to reject null: 0.0
t-statistic value: -146.64


Effect size
Cohen's d: 0.4


------------ Count_uniq_users_per_country by filter_no_lcc ------------

count    1.007692e+06
mean     5.534014e+05
std      3.181662e+05
min      1.000000e+00
25%      6.405800e+04
50%      7.431010e+05
75%      7.431010e+05
max      7.431010e+05
Name: Count_uniq_users_per_country, dtype: float64

 Paired t-test
p_val to reject null: 0.0
t-statistic value: -22.92


Effect size
Cohen's d: 0.21


------------ Count_uniq_users_per_country by filter_non_stop ------------

count    1.007692e+06
mean     5.534014e+05
std      3.181662e+05
min      1.000000e+00
25%      6.405800e+04
50%      7.431010e+05
75%      7.431010e+05
max      7.431010e+05
Name: Count_uniq_users_per_country, dtype: float64

 Paired t-test
p_val to reject null: 0.0
t-statistic value: 35.9


Effect size
Cohen's d: 0.12


------------ Count_uniq_users_per_country by filter_short_layover ------------

count    1.007692e+06
mean     5.534014e+05
std      3.181662e+05
min      1.000000e+00
25%      6.405800e+04
50%      7.431010e+05
75%      7.431010e+05
max      7.431010e+05
Name: Count_uniq_users_per_country, dtype: float64

 Paired t-test
p_val to reject null: 0.0
t-statistic value: 6.52


Effect size
Cohen's d: 0.04


------------ Count_uniq_users_per_country by filter_name ------------

count    1.007692e+06
mean     5.534014e+05
std      3.181662e+05
min      1.000000e+00
25%      6.405800e+04
50%      7.431010e+05
75%      7.431010e+05
max      7.431010e+05
Name: Count_uniq_users_per_country, dtype: float64

 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 1711.06


Effect size
Eta^2: 0.0


------------ Count_uniq_users_per_country by first_rec ------------

count    1.007692e+06
mean     5.534014e+05
std      3.181662e+05
min      1.000000e+00
25%      6.405800e+04
50%      7.431010e+05
75%      7.431010e+05
max      7.431010e+05
Name: Count_uniq_users_per_country, dtype: float64

 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 4744.52


Effect size
Eta^2: 0.0


------------ Count_uniq_users_per_country by last_rec ------------

count    1.007692e+06
mean     5.534014e+05
std      3.181662e+05
min      1.000000e+00
25%      6.405800e+04
50%      7.431010e+05
75%      7.431010e+05
max      7.431010e+05
Name: Count_uniq_users_per_country, dtype: float64

 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 2715.34


Effect size
Eta^2: 0.0


------------ Count_uniq_users_per_country by is_session_1 ------------

count    1.007692e+06
mean     5.534014e+05
std      3.181662e+05
min      1.000000e+00
25%      6.405800e+04
50%      7.431010e+05
75%      7.431010e+05
max      7.431010e+05
Name: Count_uniq_users_per_country, dtype: float64

 Paired t-test
p_val to reject null: 0.0
t-statistic value: -50.19


Effect size
Cohen's d: 0.1


------------ Count_uniq_users_per_country by Search or watch ------------

count    1.007692e+06
mean     5.534014e+05
std      3.181662e+05
min      1.000000e+00
25%      6.405800e+04
50%      7.431010e+05
75%      7.431010e+05
max      7.431010e+05
Name: Count_uniq_users_per_country, dtype: float64

 Paired t-test
p_val to reject null: 0.0
t-statistic value: 5.28


Effect size
Cohen's d: 0.01


------------ Count_uniq_users_per_country by Use frequency ------------

count    1.007692e+06
mean     5.534014e+05
std      3.181662e+05
min      1.000000e+00
25%      6.405800e+04
50%      7.431010e+05
75%      7.431010e+05
max      7.431010e+05
Name: Count_uniq_users_per_country, dtype: float64

 Paired t-test
p_val to reject null: 0.0
t-statistic value: -42.39


Effect size
Cohen's d: 0.16


------------ Count_uniq_users_per_country by continent_origin ------------

count    1.007692e+06
mean     5.534014e+05
std      3.181662e+05
min      1.000000e+00
25%      6.405800e+04
50%      7.431010e+05
75%      7.431010e+05
max      7.431010e+05
Name: Count_uniq_users_per_country, dtype: float64

 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 590989.9


Effect size
Eta^2: 0.59


------------ Count_uniq_users_per_country by City origin ------------

count    1.007692e+06
mean     5.534014e+05
std      3.181662e+05
min      1.000000e+00
25%      6.405800e+04
50%      7.431010e+05
75%      7.431010e+05
max      7.431010e+05
Name: Count_uniq_users_per_country, dtype: float64
1237 different categorical values, too many to plot.



 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 988774.79


Effect size
Eta^2: 0.98


------------ Count_uniq_users_per_country by City destination ------------

count    1.007692e+06
mean     5.534014e+05
std      3.181662e+05
min      1.000000e+00
25%      6.405800e+04
50%      7.431010e+05
75%      7.431010e+05
max      7.431010e+05
Name: Count_uniq_users_per_country, dtype: float64
1487 different categorical values, too many to plot.



 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 174779.99


Effect size
Eta^2: 0.17


------------ Count_uniq_users_per_country by region_origin ------------

count    1.007692e+06
mean     5.534014e+05
std      3.181662e+05
min      1.000000e+00
25%      6.405800e+04
50%      7.431010e+05
75%      7.431010e+05
max      7.431010e+05
Name: Count_uniq_users_per_country, dtype: float64
453 different categorical values, too many to plot.



 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 952334.7


Effect size
Eta^2: 0.95


------------ Count_uniq_users_per_country by region_destination ------------

count    1.007692e+06
mean     5.534014e+05
std      3.181662e+05
min      1.000000e+00
25%      6.405800e+04
50%      7.431010e+05
75%      7.431010e+05
max      7.431010e+05
Name: Count_uniq_users_per_country, dtype: float64
498 different categorical values, too many to plot.



 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 146437.8


Effect size
Eta^2: 0.14


------------ Count_uniq_users_per_country by country_origin ------------

count    1.007692e+06
mean     5.534014e+05
std      3.181662e+05
min      1.000000e+00
25%      6.405800e+04
50%      7.431010e+05
75%      7.431010e+05
max      7.431010e+05
Name: Count_uniq_users_per_country, dtype: float64
188 different categorical values, too many to plot.



 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 1007691.0


Effect size
Eta^2: 1.0


------------ Count_uniq_users_per_country by continent_destination ------------

count    1.007692e+06
mean     5.534014e+05
std      3.181662e+05
min      1.000000e+00
25%      6.405800e+04
50%      7.431010e+05
75%      7.431010e+05
max      7.431010e+05
Name: Count_uniq_users_per_country, dtype: float64

 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 84037.0


Effect size
Eta^2: 0.08


------------ Count_uniq_users_per_country by country_destination ------------

count    1.007692e+06
mean     5.534014e+05
std      3.181662e+05
min      1.000000e+00
25%      6.405800e+04
50%      7.431010e+05
75%      7.431010e+05
max      7.431010e+05
Name: Count_uniq_users_per_country, dtype: float64
209 different categorical values, too many to plot.



 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 136805.64


Effect size
Eta^2: 0.14


------------ Count_uniq_users_per_country by Domestic or international ------------

count    1.007692e+06
mean     5.534014e+05
std      3.181662e+05
min      1.000000e+00
25%      6.405800e+04
50%      7.431010e+05
75%      7.431010e+05
max      7.431010e+05
Name: Count_uniq_users_per_country, dtype: float64

 Paired t-test
p_val to reject null: 0.0
t-statistic value: -572.35


Effect size
Cohen's d: 1


------------ Count_uniq_users_per_country by Region ------------

count    1.007692e+06
mean     5.534014e+05
std      3.181662e+05
min      1.000000e+00
25%      6.405800e+04
50%      7.431010e+05
75%      7.431010e+05
max      7.431010e+05
Name: Count_uniq_users_per_country, dtype: float64

 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 806833.27


Effect size
Eta^2: 0.8


------------ Count_uniq_users_per_country by IncomeGroup ------------

count    1.007692e+06
mean     5.534014e+05
std      3.181662e+05
min      1.000000e+00
25%      6.405800e+04
50%      7.431010e+05
75%      7.431010e+05
max      7.431010e+05
Name: Count_uniq_users_per_country, dtype: float64

 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 351544.97


Effect size
Eta^2: 0.35


------------ Count_uniq_users_per_country by outcome ------------

count    1.007692e+06
mean     5.534014e+05
std      3.181662e+05
min      1.000000e+00
25%      6.405800e+04
50%      7.431010e+05
75%      7.431010e+05
max      7.431010e+05
Name: Count_uniq_users_per_country, dtype: float64

 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 33.8


Effect size
Eta^2: 0.0


------------ Passengers carried Q1 by origin_city ------------

count    9.892330e+05
mean     1.710664e+08
std      8.932979e+07
min      1.121500e+03
25%      2.222555e+08
50%      2.222555e+08
75%      2.222555e+08
max      2.222555e+08
Name: Passengers carried Q1, dtype: float64
1324 different categorical values, too many to plot.



 Anova - Kruskal-Wallis
p_val to reject null: nan
H value: nan


Effect size
Eta^2: nan


------------ Passengers carried Q1 by destination_city ------------

count    9.892330e+05
mean     1.710664e+08
std      8.932979e+07
min      1.121500e+03
25%      2.222555e+08
50%      2.222555e+08
75%      2.222555e+08
max      2.222555e+08
Name: Passengers carried Q1, dtype: float64
1583 different categorical values, too many to plot.



 Anova - Kruskal-Wallis
p_val to reject null: nan
H value: nan


Effect size
Eta^2: nan


------------ Passengers carried Q1 by trip_type ------------

count    9.892330e+05
mean     1.710664e+08
std      8.932979e+07
min      1.121500e+03
25%      2.222555e+08
50%      2.222555e+08
75%      2.222555e+08
max      2.222555e+08
Name: Passengers carried Q1, dtype: float64

 Paired t-test
p_val to reject null: 0.0
t-statistic value: 74.17


Effect size
Cohen's d: 0.19


------------ Passengers carried Q1 by weekend ------------

count    9.892330e+05
mean     1.710664e+08
std      8.932979e+07
min      1.121500e+03
25%      2.222555e+08
50%      2.222555e+08
75%      2.222555e+08
max      2.222555e+08
Name: Passengers carried Q1, dtype: float64

 Paired t-test
p_val to reject null: 0.0
t-statistic value: -144.67


Effect size
Cohen's d: 0.4


------------ Passengers carried Q1 by filter_no_lcc ------------

count    9.892330e+05
mean     1.710664e+08
std      8.932979e+07
min      1.121500e+03
25%      2.222555e+08
50%      2.222555e+08
75%      2.222555e+08
max      2.222555e+08
Name: Passengers carried Q1, dtype: float64

 Paired t-test
p_val to reject null: 0.0
t-statistic value: -22.59


Effect size
Cohen's d: 0.21


------------ Passengers carried Q1 by filter_non_stop ------------

count    9.892330e+05
mean     1.710664e+08
std      8.932979e+07
min      1.121500e+03
25%      2.222555e+08
50%      2.222555e+08
75%      2.222555e+08
max      2.222555e+08
Name: Passengers carried Q1, dtype: float64

 Paired t-test
p_val to reject null: 0.0
t-statistic value: 34.38


Effect size
Cohen's d: 0.12


------------ Passengers carried Q1 by filter_short_layover ------------

count    9.892330e+05
mean     1.710664e+08
std      8.932979e+07
min      1.121500e+03
25%      2.222555e+08
50%      2.222555e+08
75%      2.222555e+08
max      2.222555e+08
Name: Passengers carried Q1, dtype: float64

 Paired t-test
p_val to reject null: 0.0
t-statistic value: 7.66


Effect size
Cohen's d: 0.04


------------ Passengers carried Q1 by filter_name ------------

count    9.892330e+05
mean     1.710664e+08
std      8.932979e+07
min      1.121500e+03
25%      2.222555e+08
50%      2.222555e+08
75%      2.222555e+08
max      2.222555e+08
Name: Passengers carried Q1, dtype: float64

 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 1559.59


Effect size
Eta^2: 0.0


------------ Passengers carried Q1 by first_rec ------------

count    9.892330e+05
mean     1.710664e+08
std      8.932979e+07
min      1.121500e+03
25%      2.222555e+08
50%      2.222555e+08
75%      2.222555e+08
max      2.222555e+08
Name: Passengers carried Q1, dtype: float64

 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 3260.83


Effect size
Eta^2: 0.0


------------ Passengers carried Q1 by last_rec ------------

count    9.892330e+05
mean     1.710664e+08
std      8.932979e+07
min      1.121500e+03
25%      2.222555e+08
50%      2.222555e+08
75%      2.222555e+08
max      2.222555e+08
Name: Passengers carried Q1, dtype: float64

 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 1799.56


Effect size
Eta^2: 0.0


------------ Passengers carried Q1 by is_session_1 ------------

count    9.892330e+05
mean     1.710664e+08
std      8.932979e+07
min      1.121500e+03
25%      2.222555e+08
50%      2.222555e+08
75%      2.222555e+08
max      2.222555e+08
Name: Passengers carried Q1, dtype: float64

 Paired t-test
p_val to reject null: 0.0
t-statistic value: -50.1


Effect size
Cohen's d: 0.1


------------ Passengers carried Q1 by Search or watch ------------

count    9.892330e+05
mean     1.710664e+08
std      8.932979e+07
min      1.121500e+03
25%      2.222555e+08
50%      2.222555e+08
75%      2.222555e+08
max      2.222555e+08
Name: Passengers carried Q1, dtype: float64

 Paired t-test
p_val to reject null: 0.0
t-statistic value: 5.85


Effect size
Cohen's d: 0.01


------------ Passengers carried Q1 by Use frequency ------------

count    9.892330e+05
mean     1.710664e+08
std      8.932979e+07
min      1.121500e+03
25%      2.222555e+08
50%      2.222555e+08
75%      2.222555e+08
max      2.222555e+08
Name: Passengers carried Q1, dtype: float64

 Paired t-test
p_val to reject null: 0.0
t-statistic value: -42.36


Effect size
Cohen's d: 0.16


------------ Passengers carried Q1 by continent_origin ------------

count    9.892330e+05
mean     1.710664e+08
std      8.932979e+07
min      1.121500e+03
25%      2.222555e+08
50%      2.222555e+08
75%      2.222555e+08
max      2.222555e+08
Name: Passengers carried Q1, dtype: float64

 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 535341.13


Effect size
Eta^2: 0.54


------------ Passengers carried Q1 by City origin ------------

count    9.892330e+05
mean     1.710664e+08
std      8.932979e+07
min      1.121500e+03
25%      2.222555e+08
50%      2.222555e+08
75%      2.222555e+08
max      2.222555e+08
Name: Passengers carried Q1, dtype: float64
1237 different categorical values, too many to plot.



 Anova - Kruskal-Wallis
p_val to reject null: nan
H value: nan


Effect size
Eta^2: nan


------------ Passengers carried Q1 by City destination ------------

count    9.892330e+05
mean     1.710664e+08
std      8.932979e+07
min      1.121500e+03
25%      2.222555e+08
50%      2.222555e+08
75%      2.222555e+08
max      2.222555e+08
Name: Passengers carried Q1, dtype: float64
1487 different categorical values, too many to plot.



 Anova - Kruskal-Wallis
p_val to reject null: nan
H value: nan


Effect size
Eta^2: nan


------------ Passengers carried Q1 by region_origin ------------

count    9.892330e+05
mean     1.710664e+08
std      8.932979e+07
min      1.121500e+03
25%      2.222555e+08
50%      2.222555e+08
75%      2.222555e+08
max      2.222555e+08
Name: Passengers carried Q1, dtype: float64
453 different categorical values, too many to plot.



 Anova - Kruskal-Wallis
p_val to reject null: nan
H value: nan


Effect size
Eta^2: nan


------------ Passengers carried Q1 by region_destination ------------

count    9.892330e+05
mean     1.710664e+08
std      8.932979e+07
min      1.121500e+03
25%      2.222555e+08
50%      2.222555e+08
75%      2.222555e+08
max      2.222555e+08
Name: Passengers carried Q1, dtype: float64
498 different categorical values, too many to plot.



 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 139782.78


Effect size
Eta^2: 0.14


------------ Passengers carried Q1 by country_origin ------------

count    9.892330e+05
mean     1.710664e+08
std      8.932979e+07
min      1.121500e+03
25%      2.222555e+08
50%      2.222555e+08
75%      2.222555e+08
max      2.222555e+08
Name: Passengers carried Q1, dtype: float64
188 different categorical values, too many to plot.



 Anova - Kruskal-Wallis
p_val to reject null: nan
H value: nan


Effect size
Eta^2: nan


------------ Passengers carried Q1 by continent_destination ------------

count    9.892330e+05
mean     1.710664e+08
std      8.932979e+07
min      1.121500e+03
25%      2.222555e+08
50%      2.222555e+08
75%      2.222555e+08
max      2.222555e+08
Name: Passengers carried Q1, dtype: float64

 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 77861.72


Effect size
Eta^2: 0.08


------------ Passengers carried Q1 by country_destination ------------

count    9.892330e+05
mean     1.710664e+08
std      8.932979e+07
min      1.121500e+03
25%      2.222555e+08
50%      2.222555e+08
75%      2.222555e+08
max      2.222555e+08
Name: Passengers carried Q1, dtype: float64
209 different categorical values, too many to plot.



 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 132743.64


Effect size
Eta^2: 0.13


------------ Passengers carried Q1 by Domestic or international ------------

count    9.892330e+05
mean     1.710664e+08
std      8.932979e+07
min      1.121500e+03
25%      2.222555e+08
50%      2.222555e+08
75%      2.222555e+08
max      2.222555e+08
Name: Passengers carried Q1, dtype: float64

 Paired t-test
p_val to reject null: 0.0
t-statistic value: -542.62


Effect size
Cohen's d: 1


------------ Passengers carried Q1 by Region ------------

count    9.892330e+05
mean     1.710664e+08
std      8.932979e+07
min      1.121500e+03
25%      2.222555e+08
50%      2.222555e+08
75%      2.222555e+08
max      2.222555e+08
Name: Passengers carried Q1, dtype: float64

 Anova - Kruskal-Wallis
p_val to reject null: nan
H value: nan


Effect size
Eta^2: nan


------------ Passengers carried Q1 by IncomeGroup ------------

count    9.892330e+05
mean     1.710664e+08
std      8.932979e+07
min      1.121500e+03
25%      2.222555e+08
50%      2.222555e+08
75%      2.222555e+08
max      2.222555e+08
Name: Passengers carried Q1, dtype: float64

 Anova - Kruskal-Wallis
p_val to reject null: nan
H value: nan


Effect size
Eta^2: nan


------------ Passengers carried Q1 by outcome ------------

count    9.892330e+05
mean     1.710664e+08
std      8.932979e+07
min      1.121500e+03
25%      2.222555e+08
50%      2.222555e+08
75%      2.222555e+08
max      2.222555e+08
Name: Passengers carried Q1, dtype: float64

 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 34.93


Effect size
Eta^2: 0.0
In [30]:
numcat_correlations = numcat_correlations.fillna(0)
numcat_correlations.style.background_gradient(cmap='Blues', axis=None)
Out[30]:
| | stay | status_updates | total_notifs | total_buy_notifs | first_total | first_buy_total | lowest_total | session | first_buy - lowest_total | Adult population | Count_uniq_users_per_country | Passengers carried Q1 |
| origin_city | 0 | 0.00332128 | 0 | 0 | 0 | 0 | 0 | 0.0150028 | 0 | 0 | 1 | 0 |
| destination_city | 0 | 0.0059888 | 0 | 0 | 0 | 0 | 0 | 0.00714585 | 0 | 0 | 0.179056 | 0 |
| trip_type | 0 | 0 | 0.112452 | 0.106473 | 0.583929 | 0.637136 | 0.57819 | 0.159936 | 0.0812652 | 0.123811 | 0.215932 | 0.193932 |
| weekend | 0.684679 | 0.0824399 | 0.110737 | 0.103419 | 0.538623 | 0.51105 | 0.544071 | 0.0173271 | 0.014708 | 0.290729 | 0.39639 | 0.395655 |
| filter_no_lcc | 0.111479 | 0.372551 | 0.057119 | 0.0233623 | 0.0926168 | 0 | 0.100905 | 0.0317537 | 0.083062 | 0.163234 | 0.213423 | 0.212988 |
| filter_non_stop | 0.139014 | 0.390681 | 0.237666 | 0.176829 | 0.154903 | 0.139313 | 0.17182 | 0.0346863 | 0.0881347 | 0.0872697 | 0.121072 | 0.116993 |
| filter_short_layover | 0.171566 | 0.356703 | 0.160511 | 0.0976049 | 0.456049 | 0.459056 | 0.439535 | 0.0810463 | 0.136973 | 0 | 0.0374698 | 0.0442471 |
| filter_name | 0.00693749 | 0.0297565 | 0.0165397 | 0.00926416 | 0.0130162 | 0.0117754 | 0.013422 | 0.000434047 | 0.0057872 | 0.0012755 | 0.00169305 | 0.00157152 |
| first_rec | 0.000172745 | 0.0263566 | 0 | 0 | 0 | 0 | 0 | 0.00366203 | 0 | 0.00296992 | 0.00470633 | 0.00329431 |
| last_rec | 0.000317748 | 0.0105668 | 0 | 0 | 0 | 0 | 0 | 0.00365626 | 0 | 0.00161048 | 0.00269264 | 0.00181713 |
| is_session_1 | 0.018642 | 0.277057 | 0.261587 | 0.222544 | 0.0574585 | 0.0568244 | 0.0433786 | 1 | 0.0895381 | 0.0681421 | 0.100039 | 0.100783 |
| Search or watch | 0.00695598 | 1 | 1 | 0.895893 | 0.0386801 | 0.0259027 | 0.124391 | 0.275585 | 0.360249 | 0.00582045 | 0.0110455 | 0.0123568 |
| Use frequency | 0.024059 | 0.190557 | 0.254328 | 0.227433 | 0.0552827 | 0.0525285 | 0.0413538 | 0.941508 | 0.0832179 | 0.109622 | 0.1579 | 0.15901 |
| continent_origin | 0.033998 | 0.000299127 | 0.000207222 | 0.000149507 | 0.02309 | 0.0296428 | 0.0233317 | 0.00277105 | 0.000668066 | 0.395547 | 0.586477 | 0.541166 |
| City origin | 0 | 0.00275011 | 0 | 0 | 0 | 0 | 0 | 0.0143234 | 0 | 0 | 0.981205 | 0 |
| City destination | 0 | 0.00566344 | 0 | 0 | 0 | 0 | 0 | 0.00649934 | 0 | 0 | 0.172225 | 0 |
| region_origin | 0 | 0.00175834 | 0 | 0 | 0 | 0 | 0 | 0.0111823 | 0 | 0.940158 | 0.945042 | 0 |
| region_destination | 0 | 0.0041197 | 0.0027091 | 0.00196118 | 0.28323 | 0 | 0.286469 | 0.00430035 | 0 | 0.113964 | 0.144898 | 0.140873 |
| country_origin | 0 | 0.001305 | 0 | 0 | 0 | 0 | 0 | 0.00711964 | 0 | 0 | 1 | 0 |
| continent_destination | 0.198041 | 0.00164631 | 0.000641784 | 0.00049596 | 0.183729 | 0.154035 | 0.186334 | 0.000417032 | 0.000271742 | 0.0591565 | 0.0833911 | 0.0787046 |
| country_destination | 0.240221 | 0.0032181 | 0.00194726 | 0.00149327 | 0.2479 | 0 | 0.250959 | 0.00251591 | 0 | 0.103442 | 0.135583 | 0.134006 |
| Domestic or international | 0.572023 | 0.0557807 | 0.0451278 | 0.0427133 | 1 | 0.984622 | 1 | 0.0102012 | 0.0782495 | 0.857869 | 1 | 1 |
| Region | 0.0408892 | 0.000295541 | 0.000287851 | 7.39861e-05 | 0.0172065 | 0.0217437 | 0.0175352 | 0.00204839 | 0.00074042 | 0.66361 | 0.800674 | 0 |
| IncomeGroup | 0.023031 | 0.000159634 | 0.00011672 | 3.48849e-05 | 0.013314 | 0.0126241 | 0.0135027 | 0.000548522 | 8.22999e-05 | 0.226038 | 0.348859 | 0 |
| outcome | 0.0025015 | 0.917676 | 0.634316 | 0.405851 | 0.00277916 | 0.00283824 | 0.00462448 | 0.0240947 | 0.25455 | 1.96906e-05 | 3.15623e-05 | 3.32909e-05 |

Go to index

Example of how to use the data presented in these tables

total_notifs vs outcome

In the table of numerical vs categorical variables, we see that for:

  • outcome vs. total_notifs, eta^2 = 0.63

This is fairly high, so we investigate what may be causing it.
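
For reference, the eta^2 values in these tables are consistent with the usual effect size derived from the Kruskal-Wallis H statistic, eta^2 = (H - k + 1) / (n - k), where k is the number of groups and n the number of observations. The sketch below is an assumption about how such a value can be computed, not the notebook's own function: `kruskal_eta_squared` and `eta_squared_from_h` are hypothetical names, and the sample groups are made up.

```python
import numpy as np
from scipy import stats


def eta_squared_from_h(h_stat, n, k):
    """Eta^2 from a Kruskal-Wallis H statistic, n observations, k groups."""
    return (h_stat - k + 1) / (n - k)


def kruskal_eta_squared(groups):
    """Run Kruskal-Wallis on a list of 1-D samples; return (H, p, eta^2)."""
    groups = [np.asarray(g, dtype=float) for g in groups]
    k = len(groups)
    n = sum(len(g) for g in groups)
    h_stat, p_val = stats.kruskal(*groups)
    return h_stat, p_val, eta_squared_from_h(h_stat, n, k)


# Made-up data: three groups whose locations clearly differ.
rng = np.random.default_rng(0)
samples = [rng.normal(loc, 1.0, size=200) for loc in (0.0, 3.0, 6.0)]
h_demo, p_demo, eta_demo = kruskal_eta_squared(samples)

# Values printed above for "Passengers carried Q1 by outcome":
# H = 34.93, n = 989233 rows, k = 3 outcome levels.
eta_reported = eta_squared_from_h(34.93, 989233, 3)
```

Plugging in the values reported above for Passengers carried Q1 by outcome (H = 34.93, n = 989233, k = 3) gives roughly 3.33e-05, in line with the 3.32909e-05 shown in the table.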

In [31]:
numcat_correlations.iloc[24:, 0:5].style.background_gradient(cmap='Blues', axis=None)
Out[31]:
| | stay | status_updates | total_notifs | total_buy_notifs | first_total |
| outcome | 0.0025015 | 0.917676 | 0.634316 | 0.405851 | 0.00277916 |

a) Subset only "watches"

Since we know that 99% of the transactions that are *Search* are also *gained*, there is little information to be gained from the *search* rows. Test restricting the analysis to the subset of entries that are *Watch* and leave *Search* out.
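
One way to check a claim like this is a row-normalized contingency table of the two columns. The following is a minimal sketch on a made-up toy frame (the values are illustrative, not taken from the dataset); on the real data the same `pd.crosstab` call would be applied to `df`.

```python
import pandas as pd

# Toy data, invented for illustration only.
toy = pd.DataFrame({
    "Search or watch": ["search"] * 4 + ["watch"] * 4,
    "outcome": ["gained", "gained", "gained", "gained",
                "lost", "lost", "gained", "expected"],
})

# Share of each outcome within search rows and within watch rows.
# normalize="index" makes every row of the table sum to 1.
shares = pd.crosstab(toy["Search or watch"], toy["outcome"], normalize="index")
print(shares)
```

If nearly all of the *search* row falls in the *gained* column, outcome carries almost no information within that subset.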

b) Subset only "gained" or "lost"

Also, since we are interested in **outcome** that is only either *gained* or *lost*, we can also filter out those transactions for which the outcome is still unknown, coded as *expected*.


c) Subset only "watches" AND Subset only "gained" or "lost"

In [32]:
# Conditions: boolean masks used to subset df below
watch = df['Search or watch'] == 'watch'
gain_lost = df['outcome'].isin(['gained', 'lost'])

total_notifs vs outcome

a) Only watch

In [33]:
summarize_numerical_categorical(df[watch], num = "total_notifs", cat = "outcome", summary=False, hist=False)

------------ total_notifs by outcome ------------


 Anova - Kruskal-Wallis
p_val to reject null: 0.0
H value: 32606.13


Effect size
Eta^2: 0.1
Out[33]:
(0.0, 0.09505439705830593)

Eta^2 has now dropped from 0.63 to 0.1.

This is unsurprising, since search rows always have 0 total_notifs.

In [34]:
df['total_notifs'][df['Search or watch']=='search'].describe()
Out[34]:
count    606521.0
mean          0.0
std           0.0
min           0.0
25%           0.0
50%           0.0
75%           0.0
max           0.0
Name: total_notifs, dtype: float64

b) Only gained and lost.

In [35]:
summarize_numerical_categorical(df[gain_lost], num = "total_notifs", cat = "outcome", summary=False, hist=False)

------------ total_notifs by outcome ------------


 Paired t-test
p_val to reject null: 0.0
t-statistic value: 538.21


Effect size
Cohen's d: 0.98
Out[35]:
(0.0, 0.9758774311114617)

This is a remarkable effect size. When we exclude expected rows, total_notifs appears to have a big impact on whether the outcome is gained or lost.

HOWEVER, this is caused by combining search and watch entries. Because total_notifs is always 0 for search entries, and most gained entries are search entries, the mean of gained is pulled down. This creates a larger difference than can be attributed to any effect of total_notifs on outcome alone.
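
The mechanism can be illustrated with synthetic numbers. The sketch below assumes the common pooled-standard-deviation form of Cohen's d (which may differ from what `summarize_numerical_categorical` computes), and all arrays are made up: within "watch" rows both outcomes draw from the same distribution, so any effect that appears after pooling comes only from the zero-valued "search" rows. In the real data the watch-only effect is not zero, so this shows the direction of the bias, not its size.

```python
import numpy as np


def cohens_d(a, b):
    """Cohen's d using the pooled standard deviation of two samples."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * a.var(ddof=1) + (nb - 1) * b.var(ddof=1)) / (na + nb - 2)
    return (a.mean() - b.mean()) / np.sqrt(pooled_var)


rng = np.random.default_rng(0)

# Made-up "watch" rows: same notification distribution for both outcomes.
watch_gained = rng.poisson(5, size=500)
watch_lost = rng.poisson(5, size=5000)

# Made-up "search" rows: always 0 notifications, all counted as gained.
search_gained = np.zeros(20000)

# Watch rows only: no real difference, so d stays near 0.
d_watch_only = cohens_d(watch_lost, watch_gained)

# Search and watch pooled: the zeros drag the gained mean down and inflate d.
d_pooled = cohens_d(watch_lost, np.concatenate([watch_gained, search_gained]))

print(round(d_watch_only, 2), round(d_pooled, 2))
```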

c) Subset only "watches" AND Subset only "gained" or "lost"

In [36]:
summarize_numerical_categorical(df[gain_lost & watch], num = "total_notifs", cat = "outcome", summary=False, hist=False)

------------ total_notifs by outcome ------------


 Paired t-test
p_val to reject null: 0.0
t-statistic value: 26.75


Effect size
Cohen's d: 0.69
Out[36]:
(2.4723858106533087e-157, 0.6888764109407535)
In [37]:
df['Search or watch'][gain_lost & watch].value_counts()
Out[37]:
watch     215273
search         0
Name: Search or watch, dtype: int64
In [38]:
df['outcome'][gain_lost & watch].value_counts()
Out[38]:
lost        212463
gained        2810
expected         0
Name: outcome, dtype: int64